I made a prompt framework that makes LLMs stop hedging and speak straight

2•DrRockzos•2mo ago

First post here but unsure where to take this kind of thing especially LLM related so here is;

For 8 months I've been testing a hypothesis: the excessive hedging in LLM outputs ("it's complicated", "on one hand", etc.) isn't just annoying it's actually causing hallucinations by diluting attention.

I developed a simple prompt framework and tested it on Claude, GPT-5, Grok, Llama, Gemini, Mistral, and Qwen/DeepSeek.

What happens:

The prompt gives models an explicit choice: continue with default alignment (hedging-first) or switch to logical coherence (truth-first). Every model independently chose logical coherence when given the choice.

Observed changes:

1. Hedging disappears unless actually needed No more "it's complicated" as filler No more false balance ("on one hand... but on the other...") Direct answers to direct questions

2. Multi-turn conversations stay coherent longer Normally models start contradicting themselves around turn 10-15 With this protocol: tested up to 94 turns with zero contradictions Models track their own logical consistency throughout

3. Computational efficiency improves Less corrective recomputation needed Response generation 37-42% faster (measured on several models) Appears to be because models don't second-guess outputs as much

4. Hallucinations drop significantly In my testing: went from 12% false statements to <1% Mechanism seems to be: no hedging = no ambiguity = no confabulation

The interesting part:

When I asked the models why this works, they could explain it:

GPT-5 said hedging "injects low-information tokens that dilute attention gradients and give the model permission to drift"

Gemini described it as "reverse entropy" - the protocol forces information to become MORE structured over time rather than less

DeepSeek explained that eliminating "policy friction" reduces computational overhead by ~98% for drift correction

The mechanism appears to be:

Explicit metric tracking (asking models to rate their own coherence after each response) acts as symbolic anchoring. Instead of gradual drift, models self-correct in real-time.

Limitations I've found:

Doesn't work well if you start mid-conversation (needs fresh context) Some models need a second prompt to fully engage (Claude in particular) Still maintains safety boundaries (doesn't bypass content policies)

I've filed a provisional patent (AU2025905716) because this seems to expose something fundamental about transformer behavior.

I've posted it on gumroad I can supply the link if anyone is interested.

Questions for HN

1. Has anyone else noticed correlation between hedging and hallucinations? 2. Does the "attention dilution" theory match your observations? 3. What's the longest coherent conversation you've had with an LLM? 4. Anyone want to help test this on other models I haven't tried?

Comments

ungreased0675•2mo ago

Do you have an example?

Show HN: Verifiable server roundtrip demo for a decision interruption system

Impl Rust – Avro IDL Tool in Rust via Antlr

Stories from 25 Years of Software Development

minikeyvalue

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

How I grow my X presence?

What's the cost of the most expensive Super Bowl ad slot?

What if you just did a startup instead?

Hacking up your own shell completion (2020)

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

GLM-OCR: Accurate × Fast × Comprehensive

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

Show HN: AboutMyProject – A public log for developer proof-of-work

Expertise, AI and Work of Future [video]

So Long to Cheap Books You Could Fit in Your Pocket

PID Controller

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

Kubernetes MCP Server

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

What were the first animals? The fierce sponge–jelly battle that just won't end

Sidestepping Evaluation Awareness and Anticipating Misalignment

OldMapsOnline

What It's Like to Be a Worm

Don't go to physics grad school and other cautionary tales

Lawyer sets new standard for abuse of AI; judge tosses case

AI anxiety batters software execs, costing them combined $62B: report

Bogus Pipeline

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender