TL;DR: I built a lightweight auditor that detects hallucinations by monitoring Transformer Hidden State Dynamics in real-time. It achieves 0.90+ ROC-AUC on Gemma/Llama-3.2/Mistral using a single RTX 3050 (4GB), with a core computation time of <1ms.
What it is
The Sibainu Engine is a pre-emptive auditing layer that identifies "latent trajectory collapse"—geometric turbulence in the vector transformations between transformer layers—before the token is even sampled. It requires no training and works with frozen weights.
The "15ms vs 1ms" Latency Reality
I prioritized "no-nonsense" performance reporting. In a local Python/FastAPI environment the total response time is 15-25 ms, but it's important to distinguish the components:
Auditing Core (NumPy): < 1.0 ms. The actual vectorized math is near-instant.
System Overhead: ~12.0 ms spent on Pydantic validation and JSON-to-array conversion.
The Bottom Line: The core logic is significantly faster than the LLM's token generation speed (typically 30-70ms), meaning the audit is theoretically "zero-overhead" if integrated directly into the C++/CUDA inference pipeline.
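To make the core-vs-overhead split concrete, here is a minimal, self-contained sketch that times a toy vectorized auditing pass against the JSON round-trip an API layer would impose. The `dissonance_score` function is a hypothetical stand-in (not the engine's actual math); the point is only that NumPy's vectorized pass is dwarfed by serialization:

```python
import json
import time
import numpy as np

def dissonance_score(mid: np.ndarray, final: np.ndarray) -> float:
    """Toy stand-in for the auditing core: one vectorized pass over two
    hidden-state matrices of shape (tokens, hidden_dim)."""
    mid_n = mid / np.linalg.norm(mid, axis=-1, keepdims=True)
    fin_n = final / np.linalg.norm(final, axis=-1, keepdims=True)
    # 1 - mean cosine similarity between corresponding rows.
    return float(1.0 - np.mean(np.sum(mid_n * fin_n, axis=-1)))

rng = np.random.default_rng(0)
mid, final = rng.standard_normal((2, 32, 2048))

t0 = time.perf_counter()
score = dissonance_score(mid, final)
core_ms = (time.perf_counter() - t0) * 1000

# The JSON round-trip alone typically dwarfs the core math.
t0 = time.perf_counter()
payload = json.loads(json.dumps({"hidden": mid.tolist()}))
_ = np.asarray(payload["hidden"])
overhead_ms = (time.perf_counter() - t0) * 1000

print(f"core: {core_ms:.3f} ms, JSON round-trip: {overhead_ms:.3f} ms")
```

Running this locally shows the same shape of result the post reports: the array math is a fraction of a millisecond, while serializing the hidden states in and out of JSON costs an order of magnitude more.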
Key Metrics (Gemma-2B / HaluEval-QA)
ROC-AUC: 0.9176
Recall @ 5% False Signal Rate (FSR): 59.7% (it captures ~60% of hallucinations while flagging only 5% of factual truths).
Hardware: validated on a consumer-grade RTX 3050 (4GB) using 4-bit (NF4) quantization.
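For readers unfamiliar with "Recall @ 5% FSR": pick the flagging threshold so that only 5% of factual (negative) examples are flagged, then measure what fraction of hallucinations score above it. A small sketch on synthetic scores (the distributions here are invented for illustration, not HaluEval data):

```python
import numpy as np

def recall_at_fsr(scores_pos: np.ndarray, scores_neg: np.ndarray,
                  fsr: float = 0.05) -> tuple[float, float]:
    """Recall on hallucinations when the threshold is set so that only
    `fsr` of factual (negative) examples are flagged."""
    # Threshold = (1 - fsr) quantile of the negative-class scores.
    thresh = float(np.quantile(scores_neg, 1.0 - fsr))
    recall = float(np.mean(scores_pos > thresh))
    return recall, thresh

rng = np.random.default_rng(1)
halluc = rng.normal(2.0, 1.0, 1000)   # hallucinations: higher scores
factual = rng.normal(0.0, 1.0, 1000)  # factual recall: lower scores

recall, thresh = recall_at_fsr(halluc, factual, fsr=0.05)
print(f"recall @ 5% FSR: {recall:.1%} (threshold {thresh:.2f})")
```

The same routine, fed real per-example audit scores, reproduces the 59.7% figure's operating point: one quantile lookup on the factual scores, one comparison on the hallucination scores.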
How it works: Layer Dissonance
Instead of relying only on logit entropy, v6.4 monitors Layer Dissonance—the structural inconsistency between the middle and final layers. When a model hallucinates, the geometric stability between these layers exhibits a specific turbulence that is absent during factual recall.
Closed-Loop Recovery
I’ve included a recovery_agent_gemma.py that demonstrates Autonomous Safety Control. If the engine detects a physical neural anomaly (Score > 3.6510), it immediately aborts the session and triggers a re-generation using deterministic greedy search to stabilize the output.
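The control-flow of that recovery loop can be sketched in a few lines. Note that `generate` below is a placeholder, not the API of `recovery_agent_gemma.py`; a real integration would call the model and score its hidden states, but the abort-then-greedy-retry structure is the same:

```python
import numpy as np

ABORT_THRESHOLD = 3.6510  # anomaly score above which the demo aborts

def generate(prompt: str, greedy: bool = False, rng=None) -> tuple[str, float]:
    """Placeholder LLM call returning (text, audit_score). Greedy decoding
    is modeled as stable (low score); sampling as occasionally anomalous."""
    rng = rng if rng is not None else np.random.default_rng()
    score = 1.0 if greedy else float(rng.exponential(2.0))
    return f"answer({'greedy' if greedy else 'sampled'})", score

def audited_generate(prompt: str, rng=None) -> tuple[str, float]:
    """Closed-loop recovery: if the sampled pass trips the anomaly
    threshold, abort it and fall back to deterministic greedy decoding."""
    text, score = generate(prompt, greedy=False, rng=rng)
    if score > ABORT_THRESHOLD:
        # Abort the corrupted session; re-generate deterministically.
        text, score = generate(prompt, greedy=True)
    return text, score

text, score = audited_generate("What is the capital of France?",
                               rng=np.random.default_rng(3))
print(text, score)
```

The design choice worth noting is that the fallback is deterministic: greedy search removes sampling variance, so a retry cannot trip the same stochastic anomaly twice.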
yubainu•1h ago
When a human lies, the truth often reveals itself not in their words, but in their "tells"—a subtle change in facial expression or a shift in tone. I theorized that LLMs might exhibit similar "neural tells." When a model starts to hallucinate, there should be a detectable anomaly in the Hidden State Dynamics before the token is even sampled.
This led me to develop the Sibainu Engine.
My goal was to build a pre-emptive auditing layer that runs on consumer-grade hardware (RTX 3050 4GB). By monitoring the geometric stability (which I call "Layer Dissonance") between transformer layers in real-time, the engine identifies the "collapse of latent trajectory" with a core latency of less than 1ms.
Key Technical Highlights:
Accuracy: ROC-AUC > 0.90 across Gemma, Llama-3.2, and Mistral, without any additional training or fine-tuning.
Low Overhead: While the Python API adds some serialization delay, the vectorized NumPy core is fast enough to be integrated directly into any inference pipeline without bottlenecking generation.
Autonomous Recovery: I've included a demo where the engine aborts a "corrupted" session and triggers a deterministic re-generation the moment a physical neural anomaly is detected.
I believe that for LLM safety to be truly scalable, it needs to be lightweight and deterministic. I’m curious to hear your thoughts on this geometric approach and its potential generalizability to larger architectures.