Ask HN: How do you integration-test AI / LLMs?

1•tom1337•2h ago
We’re currently debating how to properly test integrations with external LLMs in our software.

At the moment, all LLM calls are mocked in unit and end-to-end tests. Recently, we updated a model version and it started returning responses that no longer matched our expected schema (we validate outputs strictly), which resulted in errors. Because all tests used mocked responses, this only surfaced in production.

In hindsight, a simple integration test that sends a real (non-mocked) request to the LLM provider would probably have caught this.

One idea is to have a test suite which sends each prompt to the LLM provider and checks whether the response matches the expected schema. This has its own issues since LLMs are inherently nondeterministic and these tests might be flaky, but I’m currently lacking better ideas.
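
A rough sketch of what such a schema check could look like, assuming the provider call is wrapped in a function that takes a prompt and returns the raw response text. Everything named here (`call_llm`, `EXPECTED_SCHEMA`, `attempts`, `min_passes`) is hypothetical and not taken from any real SDK. To soften the flakiness, the check sends the prompt a few times and only fails if too few responses fit the schema:

```python
import json
from typing import Callable

# Hypothetical schema for illustration: required field name -> expected Python type.
EXPECTED_SCHEMA = {"title": str, "tags": list}


def matches_schema(raw: str, schema: dict) -> bool:
    """Return True if raw is a JSON object with every required key of the right type."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # A missing key yields None, which never matches the expected type.
    return all(isinstance(data.get(key), typ) for key, typ in schema.items())


def check_prompt(call_llm: Callable[[str], str], prompt: str, schema: dict,
                 attempts: int = 3, min_passes: int = 2) -> bool:
    """Send the prompt several times; pass if enough responses fit the schema.

    call_llm is an assumed wrapper around the real provider request.
    """
    passes = sum(matches_schema(call_llm(prompt), schema) for _ in range(attempts))
    return passes >= min_passes
```

In a real suite, `call_llm` would be the non-mocked client and each production prompt would get its own `check_prompt` assertion. The thresholds are a judgment call: a strict 3-of-3 catches more regressions but is more likely to flake.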

Curious to hear how others approach this.

Ask HN: What's a book that fundamentally altered your mental models

1•brihati•1m ago•0 comments

JSON-complete data formats and programming languages

https://lemire.me/blog/2025/12/20/json-complete-data-format-and-programming-languages/
1•ibobev•2m ago•0 comments

What If Readers Like A.I.-Generated Fiction?

https://www.newyorker.com/culture/the-weekend-essay/what-if-readers-like-ai-generated-fiction
2•bookofjoe•3m ago•2 comments

Intertapes – collection of found cassette tapes from different locations

https://intertapes.net/
1•wallflower•4m ago•0 comments

I Let My Email Go for a Month and Now It Is Crushing My Will to Live

https://whatever.scalzi.com/2025/12/17/i-let-my-email-go-for-a-month-and-now-it-is-crushing-my-wi...
1•surprisetalk•4m ago•0 comments

E.W.Dijkstra Archive

https://www.cs.utexas.edu/~EWD/welcome.html
2•surprisetalk•4m ago•0 comments

Formalization of Erdős Problems

https://xenaproject.wordpress.com/2025/12/05/formalization-of-erdos-problems/
2•surprisetalk•4m ago•0 comments

Bell Labs Won Its First Nobel Prize

https://www.construction-physics.com/p/how-bell-labs-won-its-first-nobel
1•surprisetalk•4m ago•0 comments

Happy Solsthelion

1•ColinWright•5m ago•0 comments

Star Gauge

https://en.wikipedia.org/wiki/Star_Gauge
1•skogstokig•5m ago•0 comments

CO2 Batteries That Store Grid Energy Take Off Globally

https://spectrum.ieee.org/co2-battery-energy-storage
1•rbanffy•6m ago•0 comments

Show HN: Lyrics to Rolling WebVTT Converter

https://gitlab.com/9o1d/vtt
1•9o1d•6m ago•0 comments

Not a Pipe: The Treachery of Images – Sublius Edition

https://substack.com/inbox/post/182240590
1•spacebacon•8m ago•0 comments

Show HN: Passkeybot.com – add passkey auth with a few server side HTTP handlers

https://github.com/emadda/passkeybot
2•emadda•9m ago•0 comments

Adaptation of Agentic AI

https://arxiv.org/abs/2512.16301
2•Anon84•10m ago•0 comments

Titanic Digital Twin: Exploring the Reality Capture Process

https://blog.lidarnews.com/titanic-digital-twin-reality-capture/
1•Neuronaut•11m ago•0 comments

A twelve-year-old selects his favorite tech and design picks of 2025

https://micahblachman.beehiiv.com/p/my-favorite-tech-and-design-picks-of-2025
1•subdomain•11m ago•0 comments

Show HN: Batch Image Crop – Online, Fast, Private Tool to Crop Images

https://batchimagecrop.com/
1•WanderZil•13m ago•0 comments

Stephen Sondheim, Puzzle Maestro

https://www.newyorker.com/magazine/2025/12/22/stephen-sondheim-puzzle-maestro
2•fortran77•14m ago•1 comment

ARIN Public Incident Report – 4.10 Misissuance Error

https://www.arin.net/announcements/20251212/
7•immibis•14m ago•0 comments

Federated Package Management and the Zooko Triangle

https://nesbitt.io/2025/12/21/federated-package-management.html
2•zdw•16m ago•0 comments

Extreme brainstorming questions to trigger new, better ideas (2022)

https://longform.asmartbear.com/extreme-questions/
1•Brajeshwar•17m ago•0 comments

Don't Shave That Yak (2005)

https://seths.blog/2005/03/dont_shave_that/
2•Brajeshwar•18m ago•0 comments

Never Use Pixelation to Hide Sensitive Text (2014)

https://dheera.net/posts/20140725-why-you-should-never-use-pixelation/
1•basilikum•18m ago•0 comments

Presenting the Case That the Future Will Be Unrecognizable

https://secondthoughts.ai/p/the-unrecognizable-age
2•cjbarber•18m ago•1 comment

Beginner's Guide to Arguing Constructively (2020)

https://liamrosen.com/arguments.html
2•Brajeshwar•18m ago•0 comments

China's Moore Threads unveils new chip in homegrown AI race

https://asia.nikkei.com/business/tech/semiconductors/china-s-moore-threads-unveils-new-chip-in-ho...
3•teleforce•20m ago•0 comments

Green Production of Amino-Acid-Derived N-Doped Graphene for Vitrimer Composites

https://pubs.acs.org/doi/10.1021/acssuschemeng.5c09378
1•westurner•21m ago•3 comments

Everyone knew about Walliams: another powerful man protected at women's expense

https://millihill.substack.com/p/everyone-knew-about-walliams
1•binning•23m ago•0 comments

Show HN: BetterQR – I got tired of $20/mo+ subscriptions for simple QR codes

https://www.betterqr.app
3•dzrmb•24m ago•0 comments