Ask HN: What is the best way to provide continuous context to models?

24•nemath•4h ago
Given the research done to date, what in your view is the best way to provide context to a model? Are there any articles that go in depth on how Cursor does it?

How do context collation companies work?

Comments

dtagames•3h ago
There is no such thing as continuous context. There is only context that you start and stop, which is the same as typing those words in the prompt. To make anything carry over to a second thread, it must be included in the second thread's context.

Rules are just context, too, and all elaborate AI control systems boil down to these contexts and tool calls.

In other words, you can rig it up any way you like. Only the context in the actual thread (or "continuation," as it used to be called) is sent to the model, which has no memory or context outside that prompt.
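
For concreteness, here's a minimal sketch of that point, assuming the OpenAI Python SDK (the model name is illustrative; any chat-style API behaves the same way). The only "memory" a second thread has is whatever you put back into its message list:

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o-mini"  # illustrative model name

    # Thread 1: the model sees exactly this list, nothing else.
    thread_one = [{"role": "user", "content": "Name three uses of a heap."}]
    reply = client.chat.completions.create(model=MODEL, messages=thread_one)
    thread_one.append(
        {"role": "assistant", "content": reply.choices[0].message.content}
    )

    # "Carrying context over" to a second thread just means re-sending
    # thread 1 (or a summary of it) as part of the new request.
    thread_two = thread_one + [
        {"role": "user", "content": "Which one fits a task scheduler best?"}
    ]
    reply_two = client.chat.completions.create(model=MODEL, messages=thread_two)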

tcdent•1h ago
Furthermore, all of the major LLM APIs reward you for re-sending the same context with new data only appended, in the form of lower token costs (prompt caching).

There may be a day when we retroactively edit context, but the system in its current state doesn't really support that.

swid•1h ago
If you know you will be pruning or otherwise reusing the context across multiple threads, the best place for context that will be retained is at the beginning due to prompt caching - it will reduce the cost and improve the speed.

If not, inserting new context any place other than at the end will cause cache misses and therefore slow down the response and increase cost.

Models also show some bias toward tokens at the start and end of the context window, so there is potentially a reason to put important instructions in one of those places.
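
To make that concrete, a sketch assuming the Anthropic Python SDK's prompt caching (the model name and file path are placeholders): the large stable prefix goes first and is marked cacheable, and each request only appends at the end:

    import anthropic

    client = anthropic.Anthropic()

    # Large, stable context first: later requests sharing this exact
    # prefix hit the cache, which is cheaper and faster.
    stable_prefix = open("docs/architecture.md").read()  # placeholder path

    def ask(question: str):
        return client.messages.create(
            model="claude-sonnet-4-20250514",  # illustrative model name
            max_tokens=1024,
            system=[{
                "type": "text",
                "text": stable_prefix,
                # Mark the stable prefix as cacheable.
                "cache_control": {"type": "ephemeral"},
            }],
            # New content only ever goes at the end; inserting anything
            # before or into the prefix would change it and miss the cache.
            messages=[{"role": "user", "content": question}],
        )

    ask("Where is auth handled?")    # first call writes the cache
    ask("Where are jobs enqueued?")  # later calls reuse the cached prefix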

catlifeonmars•1h ago
I wonder how far you can take that. Basically, can you jam a bunch of garbage in the middle and still get useful results?

vivekraja•51m ago
I think the emerging best way is to do "agentic search" over files. If you think about it, Claude Code is quite good at navigating large codebases and finding the required context for a problem.

Further, instead of polluting the context of your main agent, you can run a subagent to do the search, retrieve the important bits of information, and report back to your main agent. This is what Claude Code does if you use the keyword "explore": it starts a subagent with Haiku, which reads tens of thousands of tokens in seconds.

From my experience, the only shortcomings of this approach right now are that it's slow, and sometimes Haiku misses some details in what it reads. These will get better very soon (in one or two generations, we will likely see Opus 4.5-level intelligence at Haiku speed/price). For now, if not missing a detail is important for your use case, you can give the output from the first subagent to a second one and ask the second one to find important details the first one missed. I've found this additional step catches most things the first search missed. You can try this for yourself with Claude Code: ask it to create a plan for your spec, then pass the plan to a second Claude Code session and ask it to find gaps and missing files in the plan.
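
A hedged sketch of that two-pass pattern (not Claude Code's actual internals, just the shape of it), assuming the Anthropic Python SDK, with model name and prompts illustrative:

    import anthropic

    client = anthropic.Anthropic()

    def run_subagent(instructions: str, material: str) -> str:
        # A cheap, fast model does the wide reading so the main
        # agent's context stays clean.
        reply = client.messages.create(
            model="claude-3-5-haiku-20241022",  # illustrative model name
            max_tokens=2048,
            system=instructions,
            messages=[{"role": "user", "content": material}],
        )
        return reply.content[0].text

    def search_context(task: str, corpus: str) -> str:
        # Pass 1: read widely, report back only the relevant bits.
        report = run_subagent(
            "Find the files and facts relevant to the task below. "
            "Report only what matters, with file paths.",
            f"Task: {task}\n\n{corpus}",
        )
        # Pass 2: a second subagent hunts for what the first missed.
        gaps = run_subagent(
            "Review this report against the same material and list any "
            "important details or files it missed.",
            f"Task: {task}\n\nReport:\n{report}\n\n{corpus}",
        )
        return report + "\n\nGaps found on review:\n" + gaps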

nl•43m ago
Gemini 3 Flash is very good at the search task (it benchmarks quite close to 3 Pro on coding tasks but is much faster). I believe Amp switched to Gemini Flash for their search agent because it is better.

bluegatty•43m ago
Every time you send a request to a model you're already providing all of the context history along with it. To edit the context, just send a different context history. You can send whatever you want as history, it's entirely up to you and entirely arbitrary.

We only think in conversational turns because that's what we've expected a conversation to 'look like'. But that's just a very deeply ingrained convention.

Forget that there is such a thing as 'turns' in an LLM convo for now; imagine that it's all 'one-shot'.

So you ask A, it responds A1.

But when you ask B and expect B1 - which depends on A and A1 already being in the convo history - consider that you are actually sending all of that again anyhow.

Behind the scenes when you think you're sending just 'B' (next prompt) you're actually sending A + A1 + B aka including the history.

A and A1 are usually 'cached' but that's not the simplest way to do it, the caching is an optimization.

Without caching the model would just process all of A + A1 + B and B1 in return just the same.

And then A + A1 + B + B1 + C and expect C1 in return.

It just so happens it will cache the state of the convo at your previous turn, so it's optimized, but the key insight is that you can send whatever context you want at any time.

If, after you send A + A1 + B + B1 + C and get C1, you want to then send A + B + C + D and expect D1 (basically sending the prompts with no responses) - you can totally do that. It will have to re-process all of that, aka no cached state, but it will definitely do it for you.

Heck you can send Z + A + X, or A + A1 + X + Y - or whatever you want.

So in that sense, what you are really doing (if you're using the simplest form of the API) is sending 'a bunch of content' and 'expecting a response'. Everything is actually 'one-shot' (prefill => response). It feels conversational, but that's just structural and operational convention.

So the very simple answer to your question is: send whatever context you want. That's it.
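
A minimal sketch of that, assuming the OpenAI Python SDK (model name illustrative): each request is one-shot, and the "history" is just a list you assemble however you like:

    from openai import OpenAI

    client = OpenAI()

    def send(messages: list) -> str:
        # One shot: the model sees exactly this list and nothing else.
        r = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=messages,
        )
        return r.choices[0].message.content

    # Conventional turns: A, then A + A1 + B.
    a = "Define a monoid."
    a1 = send([{"role": "user", "content": a}])
    b1 = send([
        {"role": "user", "content": a},
        {"role": "assistant", "content": a1},
        {"role": "user", "content": "Give a Python example of one."},
    ])

    # Or prompts with no responses at all (A + B + C + D): no cached
    # prefix will match, so it costs more, but it works just the same.
    d1 = send([
        {"role": "user", "content": a},
        {"role": "user", "content": "Give a Python example of one."},
        {"role": "user", "content": "Relate it to folds."},
        {"role": "user", "content": "Now summarize in one sentence."},
    ])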

_boffin_•28m ago
> what in your view is the best way to provide context to a model?

Are you talking about manually or in an automated fashion?

Agent_Builder•25m ago
We ran into this while building GTWY.ai. What worked for us wasn’t trying to keep a single model “continuously informed”, but breaking work into smaller steps with explicit context passed between them. Long-lived context drifted fast. Short-lived, well-scoped context stayed predictable.
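
Sketched in that spirit (step names hypothetical, assuming the OpenAI Python SDK): each step is its own short-lived call, and only the explicitly passed artifact crosses the boundary:

    from openai import OpenAI

    client = OpenAI()

    def step(instructions: str, context: str) -> str:
        # Each step is a fresh, short-lived call; it sees only the
        # context explicitly handed to it.
        r = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model name
            messages=[
                {"role": "system", "content": instructions},
                {"role": "user", "content": context},
            ],
        )
        return r.choices[0].message.content

    def pipeline(ticket: str) -> str:
        plan = step("Write a short plan for this ticket.", ticket)
        # Only the plan crosses the boundary, not the whole prior thread.
        draft = step("Carry out step 1 of this plan.", plan)
        return step("Review this work against the plan.", plan + "\n\n" + draft)
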
zarathustra333•12m ago
I've been building https://www.usesatori.sh/ to give persistent context to agents

Would be happy to onboard you personally.

The URL shortener that makes your links look as suspicious as possible

https://creepylink.com/
95•dreadsword•2h ago•23 comments

Claude Cowork exfiltrates files

https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files
560•takira•9h ago•235 comments

Furiosa: 3.5x efficiency over H100s

https://furiosa.ai/blog/introducing-rngd-server-efficient-ai-inference-at-data-center-scale
117•written-beyond•4h ago•60 comments

Ask HN: What is the best way to provide continuous context to models?

28•nemath•4h ago•10 comments

Scaling long-running autonomous coding

https://cursor.com/blog/scaling-agents
159•samwillis•7h ago•78 comments

You need a kitchen slide rule

https://entropicthoughts.com/kitchen-slide-rule
34•aebtebeten•1d ago•36 comments

Ask HN: Share your personal website

489•susam•12h ago•1451 comments

Show HN: Sparrow-1 – Audio-native model for human-level turn-taking without ASR

https://www.tavus.io/post/sparrow-1-human-level-conversational-timing-in-real-time-voice
13•code_brian•11h ago•2 comments

Ask HN: What did you find out or explore today?

26•blahaj•11h ago•14 comments

Ask HN: Weird archive.today behavior?

40•rabinovich•6h ago•14 comments

New Safari developer tools provide insight into CSS Grid Lanes

https://webkit.org/blog/17746/new-safari-developer-tools-provide-insight-into-css-grid-lanes/
11•feross•4h ago•2 comments

Bubblewrap: A nimble way to prevent agents from accessing your .env files

https://patrickmccanna.net/a-better-way-to-limit-claude-code-and-other-coding-agents-access-to-se...
49•0o_MrPatrick_o0•3h ago•41 comments

The State of OpenSSL for pyca/cryptography

https://cryptography.io/en/latest/statements/state-of-openssl/
103•SGran•7h ago•18 comments

Project SkyWatch (a.k.a. Wescam at Home)

https://ianservin.com/2026/01/13/project-skywatch-aka-wescam-at-home/
4•jjwiseman•12h ago•1 comments

Ask HN: How are you doing RAG locally?

44•tmaly•14h ago•14 comments

Why some clothes shrink in the wash and how to unshrink them

https://www.swinburne.edu.au/news/2025/08/why-some-clothes-shrink-in-the-wash-and-how-to-unshrink...
475•OptionOfT•4d ago•249 comments

Show HN: WebTiles – create a tiny 250x250 website with neighbors around you

https://webtiles.kicya.net/
148•dimden•5d ago•22 comments

First impressions of Claude Cowork

https://simonw.substack.com/p/first-impressions-of-claude-cowork
9•stosssik•23h ago•0 comments

Eigent: An open source Claude Cowork alternative

https://github.com/eigent-ai/eigent
7•WorldPeas•12h ago•2 comments

Show HN: Webctl – Browser automation for agents based on CLI instead of MCP

https://github.com/cosinusalpha/webctl
78•cosinusalpha•14h ago•23 comments

SparkFun Officially Dropping AdaFruit due to CoC Violation

https://www.sparkfun.com/official-response
425•yaleman•14h ago•424 comments

Sun Position Calculator

https://drajmarsh.bitbucket.io/earthsun.html
83•sanbor•8h ago•18 comments

ChromaDB Explorer

https://www.chroma-explorer.com/
47•arsentjev•6h ago•2 comments

Generate QR Codes with Pure SQL in PostgreSQL

https://tanelpoder.com/posts/generate-qr-code-with-pure-sql-in-postgres/
67•tanelpoder•4d ago•5 comments

Find a pub that needs you

https://www.ismypubfucked.com/
239•thinkingemote•13h ago•193 comments

How can I build a simple pulse generator to demonstrate transmission lines

https://electronics.stackexchange.com/questions/764155/how-can-i-build-a-simple-pulse-generator-t...
30•alphabetter•5d ago•6 comments

I Designed a Custom Protocol for My App

https://blog.roj.dev/how-i-designed-a-custom-protocol-for-my-app
4•_roj•2d ago•1 comments

Roam 50GB is now Roam 100GB

https://starlink.com/support/article/58c9c8b7-474e-246f-7e3c-06db3221d34d
264•bahmboo•13h ago•310 comments

Crafting Interpreters

https://craftinginterpreters.com/
52•tosh•7h ago•8 comments

Ask HN: Distributed SQL engine for ultra-wide tables

10•synsqlbythesea•7h ago•6 comments