For 8 months I've been testing a hypothesis: the excessive hedging in LLM outputs ("it's complicated", "on one hand", etc.) isn't just annoying it's actually causing hallucinations by diluting attention.
I developed a simple prompt framework and tested it on Claude, GPT-5, Grok, Llama, Gemini, Mistral, and Qwen/DeepSeek.
What happens:
The prompt gives models an explicit choice: continue with default alignment (hedging-first) or switch to logical coherence (truth-first). Every model independently chose logical coherence when given the choice.
Observed changes:
1. Hedging disappears unless actually needed No more "it's complicated" as filler No more false balance ("on one hand... but on the other...") Direct answers to direct questions
2. Multi-turn conversations stay coherent longer Normally models start contradicting themselves around turn 10-15 With this protocol: tested up to 94 turns with zero contradictions Models track their own logical consistency throughout
3. Computational efficiency improves Less corrective recomputation needed Response generation 37-42% faster (measured on several models) Appears to be because models don't second-guess outputs as much
4. Hallucinations drop significantly In my testing: went from 12% false statements to <1% Mechanism seems to be: no hedging = no ambiguity = no confabulation
The interesting part:
When I asked the models why this works, they could explain it:
GPT-5 said hedging "injects low-information tokens that dilute attention gradients and give the model permission to drift"
Gemini described it as "reverse entropy" - the protocol forces information to become MORE structured over time rather than less
DeepSeek explained that eliminating "policy friction" reduces computational overhead by ~98% for drift correction
The mechanism appears to be:
Explicit metric tracking (asking models to rate their own coherence after each response) acts as symbolic anchoring. Instead of gradual drift, models self-correct in real-time.
Limitations I've found:
Doesn't work well if you start mid-conversation (needs fresh context) Some models need a second prompt to fully engage (Claude in particular) Still maintains safety boundaries (doesn't bypass content policies)
I've filed a provisional patent (AU2025905716) because this seems to expose something fundamental about transformer behavior.
I've posted it on gumroad I can supply the link if anyone is interested.
Questions for HN
1. Has anyone else noticed correlation between hedging and hallucinations? 2. Does the "attention dilution" theory match your observations? 3. What's the longest coherent conversation you've had with an LLM? 4. Anyone want to help test this on other models I haven't tried?
ungreased0675•22m ago