Instead of using RAG, fine-tuning, or external memory, it structures each interaction through a 5-step cycle (Anchor → Analyze → Ground → Reflect → Stabilize) and preserves those anchors across turns.
In early tests, I’ve seen things like:
- A tester redefining c as 100 m/s and the model holding that constant across multi-step reasoning, instead of snapping back to 3×10⁸ m/s
- A 60–80% reduction in drift over longer conversations (50+ turns), especially around time-sensitive or constraint-sensitive tasks
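To make the anchor-persistence idea concrete, here's a minimal sketch of what an interaction-layer wrapper might look like. All names and structure here are my own illustration, not Willow's actual implementation: the core move is storing user-declared constraints and re-injecting them into every turn's prompt so the model can't silently revert to defaults.

```python
# Hypothetical sketch of an interaction-layer anchor wrapper.
# Names (AnchorStore, build_prompt) are invented for illustration.

class AnchorStore:
    """Holds user-declared constraints (e.g. 'c = 100 m/s') across turns."""

    def __init__(self):
        self.anchors = {}

    def set(self, name, value):
        self.anchors[name] = value

    def as_preamble(self):
        # Re-stated on every request so the constraint stays in context.
        if not self.anchors:
            return ""
        lines = [f"- {k}: {v}" for k, v in self.anchors.items()]
        return "Active constraints (override defaults):\n" + "\n".join(lines)


def build_prompt(store, history, user_msg):
    """Prefix the stabilized anchors to each turn's prompt."""
    parts = [store.as_preamble()] if store.anchors else []
    parts.extend(history)
    parts.append(user_msg)
    return "\n\n".join(parts)


store = AnchorStore()
store.set("speed of light c", "100 m/s (user-redefined)")
prompt = build_prompt(store, [], "How long does light take to travel 1 km?")
```

In this framing, the Anchor step writes into the store, and Stabilize re-emits the preamble each turn; the rest of the cycle would sit between them.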
I’m trying to understand where this fits relative to existing work. It feels like a runtime control layer, not training or RAG, but I don’t want to reinvent something that already exists under a different name.
GitHub (no core code yet, proprietary while I sort out IP): https://github.com/willow-intelligence/willow-demo
Live API demo (simple playground): https://willow-drift-reduction-production.up.railway.app/docs
My questions for HN:
1. Is this kind of temporal anchoring / interaction-layer drift control a known technique under another name?
2. Are there obvious failure modes or prior art I should be looking at?
3. For those of you working with LLMs in production, is an inference-layer drift-reduction wrapper actually useful, or is this just a fancy flavor of prompt engineering?
Honest technical feedback (including “this is nothing new”) is welcome.
If anyone wants to try it and share logs or impressions, I’d be happy to give access and context.
Thanks, Haley