Show HN: Detecting LLM hallucinations in <1ms using hidden states (RTX3050, 4GB)

1•yubainu•1h ago
GitHub: https://github.com/yubainu/sibainu-engine

TL;DR: I built a lightweight auditor that detects hallucinations by monitoring Transformer Hidden State Dynamics in real-time. It achieves 0.90+ ROC-AUC on Gemma/Llama-3.2/Mistral using a single RTX 3050 (4GB), with a core computation time of <1ms.

What it is

The Sibainu Engine is a pre-emptive auditing layer that identifies "latent trajectory collapse"—geometric turbulence in the vector transformations between transformer layers—before the token is even sampled. It requires no training and works with frozen weights.

The "15ms vs 1ms" Latency Reality

I prioritized "no-nonsense" performance reporting. In a local Python/FastAPI environment, the total response time is 15-25ms, but it's important to distinguish the components:

Auditing Core (NumPy): < 1.0 ms. The actual vectorized math is near-instant.

System Overhead: ~12.0 ms is spent on Pydantic validation and JSON-to-Array conversion.
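To make the split between core math and serialization concrete, here is a minimal timing sketch. It is not the engine's code: `audit_core` is a hypothetical stand-in for the vectorized math, the 18 x 2048 array shape is an assumption, and the JSON round-trip only approximates what a FastAPI/Pydantic endpoint does.

```python
import json
import time

import numpy as np


def audit_core(hidden: np.ndarray) -> float:
    """Hypothetical stand-in for the auditing math: mean norm of layer-to-layer deltas."""
    deltas = np.diff(hidden, axis=0)  # shape: (layers - 1, dim)
    return float(np.linalg.norm(deltas, axis=1).mean())


def time_ms(fn, repeats: int = 50) -> float:
    """Median wall-clock time of fn() in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return float(np.median(samples))


rng = np.random.default_rng(0)
# Assumed shape: one vector per layer for a single token position.
hidden = rng.standard_normal((18, 2048)).astype(np.float32)
# Roughly what an HTTP client would send as a request body.
payload = json.dumps({"hidden": hidden.tolist()})

core_ms = time_ms(lambda: audit_core(hidden))
decode_ms = time_ms(
    lambda: np.asarray(json.loads(payload)["hidden"], dtype=np.float32)
)

print(f"core: {core_ms:.3f} ms, JSON-to-array: {decode_ms:.3f} ms")
```

Absolute numbers depend on the machine, so the point is only the relative size of the two measurements.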

The Bottom Line: The core logic takes far less time than the LLM needs to generate a single token (typically 30-70ms), so in theory the audit is effectively "zero-overhead" if integrated directly into the C++/CUDA inference pipeline, where the Python serialization cost would not apply.

Key Metrics (Gemma-2B / HaluEval-QA)

ROC-AUC: 0.9176

Recall @ 5% False Signal Rate (FSR): 59.7% (it captures ~60% of hallucinations while flagging only 5% of factual answers).

Hardware: Validated on a consumer-grade RTX 3050 (4GB) using 4-bit (NF4) quantization.
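For readers unfamiliar with these two metrics, the sketch below shows one common way to compute them from per-sample anomaly scores. It assumes higher scores mean "more likely a hallucination" and that "False Signal Rate" corresponds to the false positive rate on factual samples. The data is synthetic and unrelated to the reported results.

```python
import numpy as np


def roc_auc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def recall_at_fsr(scores: np.ndarray, labels: np.ndarray, fsr: float = 0.05):
    """Recall on positives at a threshold that flags at most `fsr` of the negatives."""
    neg_sorted = np.sort(scores[labels == 0])
    # Pick a threshold high enough that the share of negatives strictly above it is <= fsr.
    k = int(np.ceil((len(neg_sorted) - 1) * (1.0 - fsr)))
    threshold = float(neg_sorted[k])
    recall = float((scores[labels == 1] > threshold).mean())
    return recall, threshold


# Synthetic demo: label 1 = hallucination, label 0 = factual answer.
rng = np.random.default_rng(0)
labels = np.concatenate([np.zeros(500, dtype=int), np.ones(500, dtype=int)])
scores = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])

auc = roc_auc(scores, labels)
recall, threshold = recall_at_fsr(scores, labels, fsr=0.05)
print(f"ROC-AUC={auc:.4f}  recall@5%FSR={recall:.3f}  threshold={threshold:.3f}")
```

The pairwise AUC is quadratic in the number of samples, which is fine for an evaluation set of a few thousand items but not for much larger ones.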

How it works: Layer Dissonance

Instead of looking only at logit entropy, v6.4 monitors Layer Dissonance: the structural inconsistency between the middle and final layers. When a model hallucinates, the geometric stability between these layers exhibits a specific turbulence that is absent during factual recall.
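The post does not give the exact formula, so the following is only an illustrative proxy for a "middle vs. final layer" inconsistency score: one minus the cosine similarity between the two layers' hidden vectors for the current token. The function name, the choice of the middle layer, and the metric itself are all assumptions, not the v6.4 method.

```python
from typing import Optional

import numpy as np


def layer_dissonance(hidden_states: np.ndarray, mid_layer: Optional[int] = None) -> float:
    """Illustrative proxy: 1 - cosine similarity between a middle and the final layer.

    hidden_states has shape (num_layers, hidden_dim) for one token position.
    Returns 0.0 when the two vectors point the same way and 2.0 when they are opposite.
    """
    if mid_layer is None:
        mid_layer = hidden_states.shape[0] // 2  # assumed choice of "middle"
    h_mid = hidden_states[mid_layer].astype(np.float64)
    h_final = hidden_states[-1].astype(np.float64)
    denom = np.linalg.norm(h_mid) * np.linalg.norm(h_final) + 1e-12  # guard against zero vectors
    return float(1.0 - np.dot(h_mid, h_final) / denom)


# Toy input: three "layers" with two-dimensional hidden vectors.
aligned = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
orthogonal = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
print(layer_dissonance(aligned), layer_dissonance(orthogonal))
```

With Hugging Face transformers, per-layer vectors of this kind can typically be obtained by running the model with `output_hidden_states=True`, although how the engine itself extracts them is not stated in the post.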

Closed-Loop Recovery

I’ve included a script, recovery_agent_gemma.py, that demonstrates Autonomous Safety Control. If the engine detects a physical neural anomaly (Score > 3.6510), it immediately aborts the session and triggers a re-generation using deterministic greedy decoding to stabilize the output.
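The control flow described above can be sketched as a small loop. This is not the contents of recovery_agent_gemma.py: the function name and its two inputs are hypothetical, and only the 3.6510 threshold comes from the post.

```python
from typing import Callable, Iterable, List, Tuple

ANOMALY_THRESHOLD = 3.6510  # threshold quoted in the post


def generate_with_recovery(
    sampled_steps: Iterable[Tuple[str, float]],
    greedy_generate: Callable[[], List[str]],
    threshold: float = ANOMALY_THRESHOLD,
) -> Tuple[List[str], bool]:
    """Accept sampled tokens until a score exceeds the threshold, then regenerate greedily.

    sampled_steps yields (candidate_token, anomaly_score) pairs from a sampling run.
    greedy_generate is a hypothetical callback that reruns the prompt deterministically.
    Returns (tokens, recovered), where recovered is True if the session was aborted.
    """
    tokens: List[str] = []
    for token, score in sampled_steps:
        if score > threshold:
            # Abort the sampled session and fall back to deterministic decoding.
            return greedy_generate(), True
        tokens.append(token)
    return tokens, False


# Toy run: the second step trips the threshold, so the greedy output is returned.
steps = [("Paris", 1.2), ("is", 4.1), ("large", 0.7)]
output, recovered = generate_with_recovery(steps, lambda: ["Paris", "is", "in", "France"])
print(output, recovered)
```

In Hugging Face transformers, greedy decoding usually corresponds to calling `generate` with `do_sample=False` and a single beam, which is one way the callback could be implemented.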

Comments

yubainu•1h ago
I’ve always been skeptical of the current mainstream approach to hallucination detection—using a larger, more expensive LLM to "fact-check" a smaller one after the fact. To me, this felt like an inefficient recursive loop that doesn't solve the root cause.

When a human lies, the truth often reveals itself not in their words, but in their "tells"—a subtle change in facial expression or a shift in tone. I theorized that LLMs might exhibit similar "neural tells." When a model starts to hallucinate, there should be a detectable anomaly in the Hidden State Dynamics before the token is even sampled.

This led me to develop the Sibainu Engine.

My goal was to build a pre-emptive auditing layer that runs on consumer-grade hardware (RTX 3050 4GB). By monitoring the geometric stability (which I call "Layer Dissonance") between transformer layers in real-time, the engine identifies the "collapse of latent trajectory" with a core latency of less than 1ms.

Key Technical Highlights:

Efficiency: ROC-AUC > 0.90 across Gemma, Llama-3.2, and Mistral, without any additional training or fine-tuning.

Low Overhead: While the Python API adds some serialization delay, the vectorized NumPy core is fast enough to be integrated directly into any inference pipeline without bottlenecking generation.

Autonomous Recovery: I've included a demo where the engine aborts a "corrupted" session and triggers a deterministic re-generation the moment a physical neural anomaly is detected.

I believe that for LLM safety to be truly scalable, it needs to be lightweight and deterministic. I’m curious to hear your thoughts on this geometric approach and its potential generalizability to larger architectures.
