Show HN: Subtle Failure Modes I Keep Seeing in Production‑Grade AI Systems

https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md
6•TXTOS•19h ago
Hi HN,

Over the past two years I’ve built and debugged a fair number of production pipelines—mainly retrieval‑augmented generation stacks, agent frameworks, and multi‑step reasoning services. A pattern emerged: most incidents weren’t outright crashes, but silent structural faults that slowly compromised relevance, accuracy, or stability.

I began logging every recurring fault in a shared notebook. Colleagues started using the list for post‑mortems, so I turned it into a small public reference: 16 distinct failure modes (semantic drift after chunking, embedding/meaning mismatches, cross‑session memory gaps, recursion traps, etc.). The taxonomy isn’t academic; each item references a real outage or mis‑prediction we had to fix.

Why share it?

Common vocabulary – naming a failure mode makes root‑cause discussions faster and less hand‑wavy.

Earlier detection – several teams now check new features against the list before shipping.

Community feedback – if something is missing or misclassified, I’d rather learn it here than during another 3 a.m. incident.

The reference has already helped a few startups (and my own projects) avoid hours of trial‑and‑error. If you work on LLM infrastructure, you might find a familiar bug—or a new one to watch for. The link to the full table and brief write‑ups is in the “url” field of this Show HN post.

I’m not selling anything; it’s MIT‑licensed text. Comments, critiques, or additional failure patterns are very welcome.

Thanks for taking a look.

Comments

tgrrr9111•18h ago
Wow

God, I needed this :)

Been wrangling a RAG pipeline for the past few weeks, and I swear the model looks like it’s working, but then it drops logic mid-sentence, forgets context it saw 10 seconds ago, or hallucinates citations from chunks that were actually relevant, just semantically wrong.

The worst part? No errors. Nothing crashes. You just sit there wondering if you’re going crazy or if “LLMs are just like that.”

Reading your list was like watching someone read my bug reports back to me, but actually organized. Especially the stuff on memory gaps and “interpretation collapse” — we’ve hit those exact issues and kept patching them with duct tape (reranking, re-chunking, embedding tweaks, all the usual).

So yeah, big thanks for putting this together. Even just having the names of these failure modes helps explain things to my team.

MIT license is a cherry on top. Subscribed.

TXTOS•17h ago
Yep. Been there.

Built the rerankers, stacked the re-chunkers, tweaked the embedding dimensions like a possessed oracle. Still watched the model hallucinate a reference from the correct document, but to the wrong sentence. Or answer logically, then silently veer into nonsense like it ran out of reasoning budget mid-thought.

No errors. No exceptions. Just that creeping, existential “is it me or the model?” moment.

What you wrote about interpretation collapse and memory drift? Exactly the kind of failure that doesn’t crash the pipeline — it just corrodes the answer quality until nobody trusts it anymore.

Honestly, I didn’t know I needed names for these issues until I read this post. Just having the taxonomy makes them feel real enough to debug. Major kudos.
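
For the wrong-sentence citation case mentioned above, one cheap guard is to check that each quoted sentence really occurs in the chunk it cites, so the failure becomes loud instead of silent. Below is a minimal Python sketch, not anything taken from the linked reference. The `Citation` shape, the `find_ungrounded` helper, and the chunk IDs are all hypothetical, and it assumes the pipeline can return the quoted text alongside the ID of the cited chunk.

```python
import re
from dataclasses import dataclass


@dataclass
class Citation:
    """Hypothetical shape: the text the model quoted and the chunk it cited."""
    quoted_text: str
    chunk_id: str


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivial formatting differences
    # are not reported as mismatches.
    return re.sub(r"\s+", " ", text).strip().lower()


def find_ungrounded(citations: list[Citation], chunks: dict[str, str]) -> list[Citation]:
    """Return citations whose quoted text does not appear in the cited chunk."""
    ungrounded = []
    for citation in citations:
        quote = _normalize(citation.quoted_text)
        chunk = _normalize(chunks.get(citation.chunk_id, ""))
        # An empty quote or an unknown chunk ID is treated as ungrounded.
        if not quote or quote not in chunk:
            ungrounded.append(citation)
    return ungrounded


# Illustrative data only: two chunks from the same (made-up) document.
chunks = {
    "doc1#0": "Postgres uses MVCC.",
    "doc1#1": "Vacuum reclaims dead tuples.",
}
citations = [
    Citation("Vacuum reclaims dead tuples.", "doc1#1"),  # right chunk
    Citation("Vacuum reclaims dead tuples.", "doc1#0"),  # right document, wrong chunk
]
bad = find_ungrounded(citations, chunks)
for citation in bad:
    print(f"ungrounded citation: {citation.chunk_id!r}")
```

Exact substring matching only catches verbatim quotes. Paraphrased citations would need a fuzzier comparison, which this sketch deliberately leaves out.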
