Context Rot: How increasing input tokens impacts LLM performance

https://research.trychroma.com/context-rot
74•kellyhongsn•5h ago
I work on research at Chroma, and I just published our latest technical report on context rot.

TLDR: Model performance is non-uniform across context lengths, including for the state-of-the-art GPT-4.1, Claude 4, Gemini 2.5, and Qwen3 models.

This highlights the need for context engineering. Whether relevant information is present in a model’s context is not all that matters; what matters more is how that information is presented.

Here is the complete open-source codebase to replicate our results: https://github.com/chroma-core/context-rot

Comments

tjkrusinski•3h ago
Interesting report. Are there recommended sizes for different models? How do I know what works or doesn't for my use case?
posnet•2h ago
I've definitely noticed this anecdotally.

Especially with Gemini Pro when providing long-form textual references: putting many documents in a single context window gives worse answers than having it summarize the documents first, asking a question about the summaries only, and then providing the full text of the sub-documents on request (RAG style or just a simple agent loop).

Similarly, I've personally noticed that Claude Code with Opus or Sonnet gets worse the more compactions happen. It's unclear to me whether the summary just gets worse or whether the context window ends up with a higher percentage of less relevant data, but even clearing the context and asking it to re-read the relevant files (even if they were mentioned and summarized in the compaction) gives better results.
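
For anyone who wants to try the summarize-first workflow described above, here is a minimal sketch of the control flow. Everything in it is an assumption for illustration: `answer_with_summaries`, `toy_summarize`, `toy_pick`, and `toy_answer` are hypothetical names, and the toy functions stand in for real model calls so the example runs offline. It is not how Gemini, NotebookLM, or Claude Code work internally.

```python
from typing import Callable, Dict, List


def answer_with_summaries(
    question: str,
    docs: Dict[str, str],
    summarize: Callable[[str], str],
    pick_docs: Callable[[str, Dict[str, str]], List[str]],
    answer: Callable[[str, str], str],
    max_full_docs: int = 2,
) -> str:
    """Summarize every document, choose documents from the summaries only,
    then answer with just the chosen full texts in context."""
    # Step 1: one short summary per document.
    summaries = {name: summarize(text) for name, text in docs.items()}
    # Step 2: the selector sees summaries only, never the full corpus.
    wanted = pick_docs(question, summaries)[:max_full_docs]
    # Step 3: only the requested full texts go into the final context.
    context = "\n\n".join(docs[name] for name in wanted if name in docs)
    return answer(question, context)


# Toy stand-ins for model calls (hypothetical, for illustration only).
def toy_summarize(text: str) -> str:
    return text[:60]


def toy_pick(question: str, summaries: Dict[str, str]) -> List[str]:
    q_words = set(question.lower().split())
    # Rank documents by word overlap between the question and each summary.
    return sorted(
        summaries,
        key=lambda name: -len(q_words & set(summaries[name].lower().split())),
    )


def toy_answer(question: str, context: str) -> str:
    # A real implementation would call a model here; this just echoes context.
    return context
```

In practice each toy function would be replaced by a model call, and `max_full_docs` caps how much full text ever reaches the final prompt.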

tough•2h ago
Have you tried NotebookLM? It basically does this as an app in the background (chunking and summarizing many docs), and you can chat with the full corpus using RAG.
zwaps•2h ago
Gemini loses coherence and reasoning ability well before the chat hits the context limitations, and according to this report, it is the best model on several dimensions.

Long story short: Context engineering is still king, RAG is not dead

risyachka•1h ago
Yep. The easiest way to tell someone has no experience with LLMs is if they say “RAG is dead”
apwell23•31m ago
> someone has no experience with LLMs

That's 99% of coders. No need to gatekeep.

tvshtr•59m ago
Yep, it can decohere really badly with bigger context. It's not only context related though. Sometimes it can lose focus early on in a way that makes it impossible to get it back on track.
deadbabe•31m ago
RAG was never going away. The people who say that are the same types who say software engineers will be totally replaced with AI.

LLMs will need RAG one way or another. You can hide it from the user, but it still must be there.

zwaps•2h ago
Very cool results, very comprehensive article, many insights!

Media literacy disclaimer: Chroma is a vectorDB company.

philip1209•1h ago
Chroma does vector, full-text, and regex search. And, it's designed for multitenant workloads typical of AI applications. So, not just a "vectorDB company"
tough•2h ago
this felt intuitively true, great to see some research putting hard numbers on that
lukev•1h ago
This effect is well known but not well documented so far, so great job here.

It's actually even more significant than can easily be benchmarked (though I'm glad this paper has done so).

Truly useful LLM applications live at the boundaries of what the model can do. That is, they depend on attending to some aspect of the context that might be several logical "hops" away from the actual question or task.

I suspect that the context rot problem gets much worse for these more complex tasks... in fact, exponentially so for each logical "hop" which is required to answer successfully. Each hop compounds the "attention difficulty" which is increased by long/distracting contexts.
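
To make the compounding intuition concrete, here is a toy calculation. It assumes each hop succeeds independently with the same probability, which is a simplification of my own and not something the report measures; `chain_success` is a hypothetical helper. Under that assumption the chance of getting every hop right decays geometrically with the number of hops.

```python
def chain_success(per_hop_success: float, hops: int) -> float:
    """Probability that every hop succeeds, assuming independent hops
    that each succeed with probability per_hop_success."""
    if not 0.0 <= per_hop_success <= 1.0:
        raise ValueError("per_hop_success must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return per_hop_success ** hops


if __name__ == "__main__":
    # Compare a short, clean context with a long, distracting one
    # (both per-hop rates are made-up numbers for illustration).
    for label, p in [("clean context", 0.95), ("distracting context", 0.80)]:
        row = ", ".join(f"{h} hops: {chain_success(p, h):.2f}" for h in (1, 2, 4))
        print(f"{label}: {row}")
```

A small drop in per-hop reliability, which is what a long or distracting context might cause, turns into a much larger drop once several hops are chained.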
