KEY RESULTS (Gemma-2B, N=1000):
• 54% hallucination detection with 7% false positive rate
• <1% computational overhead (runs on RTX 3050 with 4GB VRAM)
• ROC-AUC: 0.8995
WHY IT'S DIFFERENT:
Traditional methods analyze the output text semantically.
SIB-ENGINE monitors "geometric drift" in hidden states during generation, identifying the structural collapse of the latent space before the first incorrect token is sampled.
This approach offers unique advantages:
• Real-time intervention: Stop generation mid-stream
• Language-agnostic: No semantic analysis needed
• Privacy-preserving: Never reads the actual content
• Extremely lightweight: Works on consumer hardware
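To make the "stop generation mid-stream" idea concrete, here is a minimal toy sketch of a guarded decoding loop. Everything here is illustrative: `DRIFT_THRESHOLD`, `step_fn`, and the synthetic hidden states are hypothetical stand-ins, not SIB-ENGINE's actual signals; a real integration would read states from the model's forward pass.

```python
import numpy as np

DRIFT_THRESHOLD = 0.8  # hypothetical cutoff; would be tuned per model

def drift(state: np.ndarray, anchor: np.ndarray) -> float:
    """Cosine distance between the current hidden state and the prompt anchor."""
    cos = state @ anchor / (np.linalg.norm(state) * np.linalg.norm(anchor))
    return 1.0 - cos

def generate_with_guard(step_fn, anchor, max_steps=50):
    """Toy decoding loop: abort as soon as the drift signal crosses the cutoff.

    step_fn(t) stands in for one decoding step returning a hidden state.
    """
    states = []
    for t in range(max_steps):
        state = step_fn(t)
        if drift(state, anchor) > DRIFT_THRESHOLD:
            return states, "aborted"  # stop mid-stream, before emitting more tokens
        states.append(state)
    return states, "completed"

# Demo: synthetic states stay near the anchor for 5 steps, then flip away.
rng = np.random.default_rng(1)
anchor = rng.normal(size=32)

def step_fn(t):
    return anchor + 0.05 * rng.normal(size=32) if t < 5 else -anchor

states, status = generate_with_guard(step_fn, anchor)
print(status, len(states))  # aborted 5
```

Because the check only needs a dot product per step, the per-token cost is negligible next to the forward pass itself, which is consistent with the sub-1% overhead claim.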
HOW IT WORKS: SIB-ENGINE monitors the internal stability of the model's computation. While the system uses multiple structural signals to detect instability, the two primary indicators are:
• Representation Stability: Tracking how the initial intent is preserved or distorted as it moves through the model's transformation space.
• Cross-Layer Alignment: Monitoring the consensus of information processing across different neural depths to identify early-stage divergence.
When these (and other proprietary structural signals) deviate from the expected stable manifold, the system flags a potential hallucination before it manifests in the output.
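Since the proprietary signals aren't published, here is a rough sketch of what the two named indicators could look like, using synthetic hidden states. The function names, the mean-of-prompt anchor, and the pairwise-cosine consensus measure are my assumptions, not the actual SIB-ENGINE formulation.

```python
import numpy as np

def anchor_drift(hidden_states: np.ndarray, anchor: np.ndarray) -> np.ndarray:
    """Representation stability: cosine distance of each step's hidden state
    from the prompt anchor.

    hidden_states: (steps, dim) per-token final-layer states
    anchor: (dim,) e.g. the mean hidden state over the prompt tokens
    """
    hs = hidden_states / np.linalg.norm(hidden_states, axis=-1, keepdims=True)
    a = anchor / np.linalg.norm(anchor)
    return 1.0 - hs @ a  # 0 = aligned with the prompt, 2 = opposed

def layer_consensus(layer_states: np.ndarray) -> float:
    """Cross-layer alignment: mean pairwise cosine similarity of one token's
    state across layers. Low consensus suggests the layers 'disagree'.

    layer_states: (layers, dim)
    """
    ls = layer_states / np.linalg.norm(layer_states, axis=-1, keepdims=True)
    sim = ls @ ls.T
    n = len(ls)
    return (sim.sum() - n) / (n * (n - 1))  # mean of the off-diagonal entries

# Toy demo: a stable run vs one whose states are unrelated to the prompt.
rng = np.random.default_rng(0)
anchor = rng.normal(size=64)
stable = anchor + 0.1 * rng.normal(size=(10, 64))
drifting = rng.normal(size=(10, 64))
print(anchor_drift(stable, anchor).mean() < anchor_drift(drifting, anchor).mean())

aligned = np.tile(anchor, (6, 1)) + 0.05 * rng.normal(size=(6, 64))
print(layer_consensus(aligned) > layer_consensus(rng.normal(size=(6, 64))))
```

In practice a detector would threshold these signals (or a learned combination of them) per step, which is where the precision/recall trade-off reported below comes from.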
DEMO & CODE:
• Demo video: https://www.youtube.com/watch?v=H1_zDC0SXQ8
• GitHub: https://github.com/yubainu/sibainu-engine
• Raw data: raw_logs.csv (full transparency)
LIMITATIONS:
• Tested on Gemma-2B only (2.5B parameters)
• Designed to scale, but needs validation on larger models
• Catches "structurally unstable" hallucinations (about half)
• Best used as first-line defense in ensemble systems
TECHNICAL NOTES:
• No external models needed (unlike self-consistency methods)
• No knowledge bases required (unlike RAG approaches)
• Adds ~1% inference time vs. 300-500% for semantic methods
• Works by monitoring the process, not the product
I'd love feedback on:
• Validation on larger models (seeking partners and compute resources for large-scale validation)
• Integration patterns for production systems
• Comparison with other structural approaches
• Edge cases where geometric signals fail
This represents a fundamentally different paradigm: instead of asking "is this text correct?", we ask "was the generation process unstable?" The answer is surprisingly informative.
Happy to discuss technical details in the comments!
yubainu•1h ago
SIB-ENGINE is my attempt to solve this at the geometric layer. By monitoring the "Anchor Drift" (how hidden states deviate from the prompt’s latent trajectory), I found that hallucinations often manifest as a structural instability before the token is even sampled.
The Numbers:
Recall: 53.89% (it catches about half, but does so consistently)
Precision: 88.52% (a low false-alarm rate is my priority)
Overhead: <1% (Running on an RTX 3050 with 4GB VRAM)
AUC: 0.8995
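For readers cross-checking these rates: recall 53.89%, precision 88.52%, and a 7% false-positive rate are mutually consistent if the N=1000 run split roughly evenly into hallucinated and clean cases. The confusion counts below are illustrative, chosen to approximately reproduce the reported figures; the actual per-example labels are in raw_logs.csv.

```python
# Illustrative counts assuming ~500 hallucinated / ~500 clean cases (N=1000).
TP, FN = 269, 231  # hallucinations caught / missed
FP, TN = 35, 465   # clean generations flagged / passed

recall = TP / (TP + FN)      # ~0.538
precision = TP / (TP + FP)   # ~0.885
fpr = FP / (FP + TN)         # 0.070
print(f"recall={recall:.3f} precision={precision:.3f} fpr={fpr:.3f}")
```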
I've released a Lite version (1-axis) on GitHub so you can see the fundamental logic and run it on your own machine. I’ve also included the raw_logs.csv from my N=1000 test run on Gemma-2B for full transparency.
I’m particularly curious if anyone here has experimented with similar geometric approaches or has thoughts on how this might scale to 70B+ models where the latent space is significantly denser.
Happy to dive into the technical details!