Plenty of long, loooong and complex role plays, world building and tests to see if I could integrate dozens of different local models into a game project or similar.
All of the same issues there apply here for "agents" as well.
You very quickly learn that even current models are like distracted puppies. Larger models seem to be able to brute-force their way through some of these problems, but I wouldn't call that sustainable.
What methods have you found to brute-force through the problem?
rorylaitila•7mo ago
fennecbutt•7mo ago
It's just that beyond playing with more heads, specialised heads, KV caching, etc., it doesn't seem like anybody's figured out the next step here yet.
Attention is already pretty atrocious performance-wise even with caching, so additional context metadata would have to be implemented carefully.
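To make the "atrocious even with caching" point concrete, here's a minimal back-of-the-envelope sketch (my own illustrative numbers, not measurements of any real model): with a KV cache each new token still has to attend over the whole cached context, so per-token cost grows linearly with context length and total decode cost grows roughly quadratically.

```python
# Rough cost model for decoder self-attention with a KV cache.
# All sizes below are hypothetical, for illustration only.

def attention_flops_per_token(context_len: int, d_model: int, n_layers: int) -> float:
    """Approximate attention FLOPs to generate ONE new token with a KV cache:
    the new query attends over all context_len cached keys/values, so the
    per-token cost is linear in context length."""
    # QK^T scores plus the weighted sum over V, per layer: ~4 * context_len * d_model
    return n_layers * 4 * context_len * d_model

def attention_flops_full_decode(total_tokens: int, d_model: int, n_layers: int) -> float:
    """Total attention FLOPs to decode total_tokens tokens one at a time.
    Summing the linear per-token cost gives ~quadratic growth overall."""
    return sum(attention_flops_per_token(t, d_model, n_layers)
               for t in range(1, total_tokens + 1))

if __name__ == "__main__":
    # Hypothetical 32-layer model with d_model=4096 (roughly 7B-class dimensions).
    for ctx in (2_000, 8_000, 32_000, 128_000):
        per_tok = attention_flops_per_token(ctx, d_model=4096, n_layers=32)
        total = attention_flops_full_decode(ctx, d_model=4096, n_layers=32)
        print(f"context {ctx:>7}: ~{per_tok:.2e} attn FLOPs/token, ~{total:.2e} total")
```

So the KV cache only saves you from recomputing old keys/values; any extra per-token context metadata still multiplies a cost that already scales with context length.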