Ask HN: How are you checking if your LLM is giving customers the right answer?

2•navaed01•1d ago
Something that’s been bothering me is observability with LLMs: how do you check that the model is giving customers the right answer?

There seem to be multiple failure points: hallucinations, partial responses (missing facts), wrongly saying that information does not exist, and response accuracy that depends on how and what is being asked.

How are you measuring this in production today?
- Thumbs up/down seems like a weak signal.
- Running a sample of ‘known queries’ assumes you know what is being asked.
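
To make the ‘known queries’ idea concrete, here is a minimal sketch of a golden-set check. It is not a recommended or complete solution. Every name in it (`GoldenCase`, `check_answer`, `run_golden_set`, the `ask` callable, the refusal phrases) is hypothetical, and substring matching stands in for whatever fact-matching a real system would need. It only covers two of the failure points above: missing facts and "does not exist" replies to questions that have an answer.

```python
from dataclasses import dataclass, field


@dataclass
class GoldenCase:
    """One known query plus the facts a correct answer must mention (hypothetical schema)."""
    query: str
    required_facts: list[str]
    # Phrases that suggest the model claimed the information is unavailable.
    # Assumed wording; tune to how your own model phrases refusals.
    refusal_markers: list[str] = field(
        default_factory=lambda: ["does not exist", "no information"]
    )


def check_answer(case: GoldenCase, answer: str) -> dict:
    """Flag missing facts and false 'not found' replies using naive substring matching."""
    text = answer.lower()
    missing = [fact for fact in case.required_facts if fact.lower() not in text]
    refused = any(marker in text for marker in case.refusal_markers)
    return {
        "query": case.query,
        "missing_facts": missing,
        "false_refusal": refused,
        "passed": not missing and not refused,
    }


def run_golden_set(cases, ask):
    """Run every case through `ask` (any callable: query -> answer) and return (pass_rate, results)."""
    results = [check_answer(case, ask(case.query)) for case in cases]
    pass_rate = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return pass_rate, results
```

In use, `ask` would wrap the production model call, and the pass rate could be tracked over time as a regression signal. The limitation noted above still applies: this only measures queries you already know to ask.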

What have you tried that works for you?
