Most Agentic AI failures I've debugged turned out to be ingestion drift

2•wehadit•38m ago
Over the last few months, we’ve been building an autonomous agentic AI system, and something unexpected kept showing up. I went in thinking the issues were with embeddings or the retriever, but the root cause was usually ingestion drifting upstream.

Some patterns that kept repeating:
• PDFs extracting differently after a small template or export tool change
• headings collapsing or shifting levels
• hidden characters creeping into tokens
• tables losing their structure
• documents updated without being re-ingested
• different converters producing slightly different text layouts
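
One of the patterns above, hidden characters creeping into tokens, is fairly easy to check mechanically. A minimal sketch in Python using only the standard library; the function names `find_hidden_chars` and `strip_hidden_chars` are hypothetical, and treating the Unicode categories `Cf` and `Cc` as "hidden" is an assumption that may need tuning per corpus:

```python
import unicodedata

# Whitespace control characters that are normally legitimate in extracted text.
ALLOWED_CONTROLS = {"\n", "\t", "\r"}

def is_hidden(ch):
    """True for invisible format (Cf) or control (Cc) characters."""
    cat = unicodedata.category(ch)
    return cat == "Cf" or (cat == "Cc" and ch not in ALLOWED_CONTROLS)

def find_hidden_chars(text):
    """Return (index, codepoint, name) for every hidden character in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if is_hidden(ch)
    ]

def strip_hidden_chars(text):
    """Apply NFKC normalization, then drop hidden characters."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in text if not is_hidden(ch))
```

Logging the output of `find_hidden_chars` per document before stripping makes it possible to see which source or converter introduced the characters, rather than silently cleaning them away.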

We only noticed the drift once we started diffing extraction output week-to-week and tracking token count variance. Running two extractors on the same file also revealed inconsistencies that weren’t obvious from looking at the text.
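
A rough sketch of that kind of check in Python, standard library only. The names `token_count`, `drift_report`, and `extraction_diff` are hypothetical, and the whitespace split is an assumed stand-in for whatever tokenizer the pipeline actually uses:

```python
import difflib
import hashlib

def token_count(text):
    # Assumption: whitespace tokens approximate the real tokenizer's counts.
    return len(text.split())

def drift_report(old_text, new_text):
    """Compare two extractions of the same document."""
    old_n, new_n = token_count(old_text), token_count(new_text)
    old_hash = hashlib.sha256(old_text.encode("utf-8")).hexdigest()
    new_hash = hashlib.sha256(new_text.encode("utf-8")).hexdigest()
    matcher = difflib.SequenceMatcher(
        None, old_text.splitlines(), new_text.splitlines()
    )
    return {
        "changed": old_hash != new_hash,
        "old_tokens": old_n,
        "new_tokens": new_n,
        "token_delta_pct": 100.0 * (new_n - old_n) / old_n if old_n else 0.0,
        "line_similarity": matcher.ratio(),  # 1.0 means identical line sequences
    }

def extraction_diff(old_text, new_text):
    """Line-level unified diff between two extractions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )
```

The same two functions work for both comparisons mentioned above: pass last week's and this week's output for one file, or pass the output of two different extractors run on the same file.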

Even with pinned extractor versions, mixed-format sources (Google Docs, Word, Confluence exports, scanned PDFs) still drifted subtly over time. The retriever was doing exactly what it was told; the input data just wasn’t consistent anymore.

Curious if others have seen this. How do you keep ingestion stable in production RAG/Agentic AI systems?

Comments

chasing0entropy•9m ago
This is by design. AI that has consistent, reliable, accurate output is boring
