Formal Proof: LLM Hallucinations Are Structural, Not Statistical (Coq Verified)

2•ICBTheory•2mo ago

Comments

ICBTheory•2mo ago

Author here.

This paper is Part III of a trilogy investigating the limits of algorithmic cognition. Given the recent industry signals about "scaling plateaus" (e.g., from Sutskever), I attempt to formalize why these limits appear structurally unavoidable.

The Thesis: We model modern AI as a Probabilistic Bounded Semantic System (P-BoSS). The paper demonstrates via the "Inference Trilemma" that hallucinations are not transient bugs to be fixed by more data, but mathematical necessities when a bounded system faces fat-tailed domains (alpha ≤ 1).
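To make the alpha ≤ 1 condition concrete, here is a small worked example. It assumes that alpha denotes the tail index of a Pareto-type distribution with scale x_m, which the summary above does not spell out, so treat it as an illustration rather than the paper's own definition. Under that assumption the mean is finite only for alpha > 1, so at alpha ≤ 1 no finite sample average settles on a stable expectation:

```latex
% Assumption: a Pareto density with tail index \alpha and scale x_m > 0
f(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha+1}}, \qquad x \ge x_m

\mathbb{E}[X]
  = \int_{x_m}^{\infty} x\, f(x)\, dx
  = \alpha\, x_m^{\alpha} \int_{x_m}^{\infty} x^{-\alpha}\, dx
  = \begin{cases}
      \dfrac{\alpha\, x_m}{\alpha - 1}, & \alpha > 1 \\[6pt]
      \infty, & \alpha \le 1
    \end{cases}
```

At alpha = 1 the integrand is 1/x, so the integral grows like the logarithm of the upper limit and never converges.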

The Proof: While this paper focuses on the CS implications, the underlying mathematical theorems (Rice’s Theorem applied to Semantic Frames, Sheaf Theoretic Gluing Failures) are formally verified using Coq.

You can find the formal proofs and the Coq code in the companion paper (Part II) here:

https://philpapers.org/rec/SCHTIC-16

I’m happy to discuss the P-BoSS definition and why probabilistic mitigation fails in divergent entropy regimes.

wiz21c•2mo ago

Since we can't avoid hallucinations, maybe we can live with them?

I mean, I regularly use LLMs and although they sometimes go a bit mad, most of the time they're really helpful

ICBTheory•2mo ago

I'd say that conclusion is a manifestation of pragmatic wisdom.

Anyway: I agree. The paper certainly doesn't argue that AI is useless, but that autonomy in high-stakes domains is mathematically unsafe.

In the text, I distinguish between operating on an 'Island of Order' (where hallucinations are cheap and correctable, like fixing a syntax error in code) versus navigating the 'Fat-Tailed Ocean' (where a single error is irreversible).

Tying this back to your comment: If an AI hallucinates a variable name — no problem, you just fix it. But I would advise skepticism if an AI suggests telling your boss that 'his professional expertise still has significant room for improvement.'

If hallucinations are structural (as the Coq proof in Part II indicates), then 'living with them' means ensuring the system never has the autonomy to execute that second type of decision.
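For anyone who wants to see what "never has the autonomy" could look like in practice, here is a minimal sketch. It is not taken from the paper: the names `Action`, `execute`, `run`, and `approve` are hypothetical, and it assumes that someone has already labeled each action as reversible or not, which is the hard part in real systems.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    """A proposed action; all names here are hypothetical, for illustration only."""
    description: str
    reversible: bool  # assumption: reversibility is labeled upstream


def execute(
    action: Action,
    run: Callable[[Action], None],
    approve: Optional[Callable[[Action], bool]] = None,
) -> str:
    """Run reversible actions directly; gate irreversible ones behind approval."""
    if action.reversible:
        # 'Island of Order': a mistake here is cheap and can be corrected later.
        run(action)
        return "executed"
    if approve is not None and approve(action):
        # 'Fat-Tailed Ocean': only proceeds after an explicit human sign-off.
        run(action)
        return "executed_with_approval"
    # No approver, or the approver declined: the system does not act on its own.
    return "blocked"
```

With this shape, a hallucinated variable name goes straight through and gets fixed afterwards, while the message to your boss returns "blocked" unless a person passes an `approve` callback that says yes.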
