KEY RESULTS (Gemma-2B, N=1000):
• 54% hallucination detection with 7% false positive rate
• <1% computational overhead (runs on RTX 3050 with 4GB VRAM)
• ROC-AUC: 0.8995
WHY IT'S DIFFERENT:
Traditional methods analyze the output text semantically.
SIB-ENGINE monitors "geometric drift" in hidden states during generation, identifying the structural collapse of the latent space before the first incorrect token is sampled.
This approach offers unique advantages:
• Real-time intervention: Stop generation mid-stream
• Language-agnostic: No semantic analysis needed
• Privacy-preserving: Never reads the actual content
• Extremely lightweight: Works on consumer hardware
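To make the "stop generation mid-stream" idea concrete, here is a minimal toy sketch of a guarded decoding loop. Everything here is illustrative: `DRIFT_THRESHOLD`, `step_fn`, and the synthetic hidden states are hypothetical stand-ins, not SIB-ENGINE's actual signals; a real integration would read states from the model's forward pass.

```python
import numpy as np

DRIFT_THRESHOLD = 0.8  # hypothetical cutoff; would be tuned per model

def drift(state: np.ndarray, anchor: np.ndarray) -> float:
    """Cosine distance between the current hidden state and the prompt anchor."""
    cos = state @ anchor / (np.linalg.norm(state) * np.linalg.norm(anchor))
    return 1.0 - cos

def generate_with_guard(step_fn, anchor, max_steps=50):
    """Toy decoding loop: abort as soon as the drift signal crosses the cutoff.

    step_fn(t) stands in for one decoding step returning a hidden state.
    """
    states = []
    for t in range(max_steps):
        state = step_fn(t)
        if drift(state, anchor) > DRIFT_THRESHOLD:
            return states, "aborted"  # stop mid-stream, before emitting more tokens
        states.append(state)
    return states, "completed"

# Demo: synthetic states stay near the anchor for 5 steps, then flip away.
rng = np.random.default_rng(1)
anchor = rng.normal(size=32)

def step_fn(t):
    return anchor + 0.05 * rng.normal(size=32) if t < 5 else -anchor

states, status = generate_with_guard(step_fn, anchor)
print(status, len(states))  # aborted 5
```

Because the check only needs a dot product per step, the per-token cost is negligible next to the forward pass itself, which is consistent with the sub-1% overhead claim.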
HOW IT WORKS: SIB-ENGINE monitors the internal stability of the model's computation. While the system uses multiple structural signals to detect instability, the two primary indicators are:
• Representation Stability: Tracking how the initial intent is preserved or distorted as it moves through the model's transformation space.
• Cross-Layer Alignment: Monitoring the consensus of information processing across different neural depths to identify early-stage divergence.
When these (and other proprietary structural signals) deviate from the expected stable manifold, the system flags a potential hallucination before it manifests in the output.
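Since the proprietary signals aren't published, here is a rough sketch of what the two named indicators could look like, using synthetic hidden states. The function names, the mean-of-prompt anchor, and the pairwise-cosine consensus measure are my assumptions, not the actual SIB-ENGINE formulation.

```python
import numpy as np

def anchor_drift(hidden_states: np.ndarray, anchor: np.ndarray) -> np.ndarray:
    """Representation stability: cosine distance of each step's hidden state
    from the prompt anchor.

    hidden_states: (steps, dim) per-token final-layer states
    anchor: (dim,) e.g. the mean hidden state over the prompt tokens
    """
    hs = hidden_states / np.linalg.norm(hidden_states, axis=-1, keepdims=True)
    a = anchor / np.linalg.norm(anchor)
    return 1.0 - hs @ a  # 0 = aligned with the prompt, 2 = opposed

def layer_consensus(layer_states: np.ndarray) -> float:
    """Cross-layer alignment: mean pairwise cosine similarity of one token's
    state across layers. Low consensus suggests the layers 'disagree'.

    layer_states: (layers, dim)
    """
    ls = layer_states / np.linalg.norm(layer_states, axis=-1, keepdims=True)
    sim = ls @ ls.T
    n = len(ls)
    return (sim.sum() - n) / (n * (n - 1))  # mean of the off-diagonal entries

# Toy demo: a stable run vs one whose states are unrelated to the prompt.
rng = np.random.default_rng(0)
anchor = rng.normal(size=64)
stable = anchor + 0.1 * rng.normal(size=(10, 64))
drifting = rng.normal(size=(10, 64))
print(anchor_drift(stable, anchor).mean() < anchor_drift(drifting, anchor).mean())

aligned = np.tile(anchor, (6, 1)) + 0.05 * rng.normal(size=(6, 64))
print(layer_consensus(aligned) > layer_consensus(rng.normal(size=(6, 64))))
```

In practice a detector would threshold these signals (or a learned combination of them) per step, which is where the precision/recall trade-off reported below comes from.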
DEMO & CODE:
• Demo video: https://www.youtube.com/watch?v=H1_zDC0SXQ8
• GitHub: https://github.com/yubainu/sibainu-engine
• Raw data: raw_logs.csv (full transparency)
LIMITATIONS:
• Tested on Gemma-2B only (2.5B parameters)
• Designed to scale, but needs validation on larger models
• Catches "structurally unstable" hallucinations (about half)
• Best used as first-line defense in ensemble systems
TECHNICAL NOTES:
• No external models needed (unlike self-consistency methods)
• No knowledge bases required (unlike RAG approaches)
• Adds ~1% inference time vs. 300-500% for semantic methods
• Works by monitoring the process, not the product
I'd love feedback on:
• Validation on larger models (seeking partners and compute resources for large-scale validation)
• Integration patterns for production systems
• Comparison with other structural approaches
• Edge cases where geometric signals fail
This represents a fundamentally different paradigm: instead of asking "is this text correct?", we ask "was the generation process unstable?" The answer is surprisingly informative.
Happy to discuss technical details in the comments!
yubainu•1h ago
SIB-ENGINE is my attempt to solve this at the geometric layer. By monitoring the "Anchor Drift" (how hidden states deviate from the prompt’s latent trajectory), I found that hallucinations often manifest as a structural instability before the token is even sampled.
The Numbers:
Recall: 53.89% (it catches about half, but does so consistently)
Precision: 88.52% (a low false-alarm rate is my priority)
Overhead: <1% (Running on an RTX 3050 with 4GB VRAM)
AUC: 0.8995
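For readers cross-checking these rates: recall 53.89%, precision 88.52%, and a 7% false-positive rate are mutually consistent if the N=1000 run split roughly evenly into hallucinated and clean cases. The confusion counts below are illustrative, chosen to approximately reproduce the reported figures; the actual per-example labels are in raw_logs.csv.

```python
# Illustrative counts assuming ~500 hallucinated / ~500 clean cases (N=1000).
TP, FN = 269, 231  # hallucinations caught / missed
FP, TN = 35, 465   # clean generations flagged / passed

recall = TP / (TP + FN)      # ~0.538
precision = TP / (TP + FP)   # ~0.885
fpr = FP / (FP + TN)         # 0.070
print(f"recall={recall:.3f} precision={precision:.3f} fpr={fpr:.3f}")
```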
I've released a Lite version (1-axis) on GitHub so you can see the fundamental logic and run it on your own machine. I’ve also included the raw_logs.csv from my N=1000 test run on Gemma-2B for full transparency.
I’m particularly curious if anyone here has experimented with similar geometric approaches or has thoughts on how this might scale to 70B+ models where the latent space is significantly denser.
Happy to dive into the technical details!