

Show HN: Detect LLM hallucinations via geometric drift (0.9 AUC, 1% overhead)

https://github.com/yubainu/sibainu-engine
1•yubainu•2h ago
I built SIB-ENGINE, a real-time hallucination detection system that monitors LLM internal structure rather than output content.

KEY RESULTS (Gemma-2B, N=1000):
• 54% hallucination detection with 7% false positive rate
• <1% computational overhead (runs on RTX 3050 with 4GB VRAM)
• ROC-AUC: 0.8995

WHY IT'S DIFFERENT: Traditional methods analyze the output text semantically. SIB-ENGINE monitors "geometric drift" in hidden states during generation, identifying the structural collapse of the latent space before the first incorrect token is sampled.

This approach offers unique advantages:
• Real-time intervention: Stop generation mid-stream
• Language-agnostic: No semantic analysis needed
• Privacy-preserving: Never reads the actual content
• Extremely lightweight: Works on consumer hardware
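To illustrate the "stop generation mid-stream" idea, here is a minimal sketch of a wrapper that halts a token stream once a per-step instability score stays above a threshold. The function name `stop_on_instability`, the `threshold` and `patience` parameters, and the toy scores are all assumptions for illustration; the post does not describe how SIB-ENGINE actually triggers an intervention.

```python
from typing import Iterable, Iterator, Tuple


def stop_on_instability(
    steps: Iterable[Tuple[str, float]],
    threshold: float,
    patience: int = 2,
) -> Iterator[str]:
    """Yield tokens until the instability score exceeds `threshold`
    for `patience` consecutive steps, then stop generation early."""
    strikes = 0
    for token, score in steps:
        # Count consecutive steps above the threshold; reset on a calm step.
        strikes = strikes + 1 if score > threshold else 0
        if strikes >= patience:
            return  # halt before emitting the flagged token
        yield token


# Made-up (token, score) pairs standing in for a real decoding loop.
stream = [("The", 0.1), ("capital", 0.2), ("is", 0.7), ("Mars", 0.9), (".", 0.1)]
print(list(stop_on_instability(stream, threshold=0.5)))
```

Requiring a few consecutive high scores (the hypothetical `patience` knob) is one way to trade a little latency for fewer spurious stops; a production integration would likely need to tune both values per model.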

HOW IT WORKS: SIB-ENGINE monitors the internal stability of the model's computation. The system uses multiple structural signals to detect instability; the two primary indicators are:

Representation Stability: Tracking how the initial intent is preserved or distorted as it moves through the model's transformation space.

Cross-Layer Alignment: Monitoring the consensus of information processing across different neural depths to identify early-stage divergence.

When these indicators (and other proprietary structural signals) deviate from the expected stable manifold, the system flags a potential hallucination before it manifests in the output.
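For readers who want something concrete, the two indicators above could be approximated with plain cosine geometry. This is only a sketch under assumptions, not the SIB-ENGINE implementation (which the post says also relies on proprietary signals): `anchor_drift` and `cross_layer_alignment` are hypothetical names, and the short vectors are toy stand-ins for real hidden states.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def anchor_drift(anchor: Vector, hidden: Vector) -> float:
    """Hypothetical 'representation stability' signal: cosine distance
    between a prompt-derived anchor vector and the current hidden state."""
    return 1.0 - cosine_similarity(anchor, hidden)


def cross_layer_alignment(layer_states: Sequence[Vector]) -> float:
    """Hypothetical 'cross-layer alignment' signal: mean cosine similarity
    between hidden states of adjacent layers at one token position."""
    pairs = list(zip(layer_states, layer_states[1:]))
    if not pairs:
        return 1.0  # a single layer is trivially aligned with itself
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Toy vectors standing in for real hidden states (made-up numbers).
anchor = [1.0, 0.0, 0.0]
stable = [0.9, 0.1, 0.0]
drifted = [0.0, 1.0, 0.0]

print(anchor_drift(anchor, stable))    # small: state still points along the anchor
print(anchor_drift(anchor, drifted))   # large: state is orthogonal to the anchor
print(cross_layer_alignment([anchor, stable, drifted]))
```

With a real model, the vectors would come from per-layer hidden states captured during generation, and a flag would be raised when drift rises or alignment falls past some calibrated bound.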

DEMO & CODE:
• Demo video: https://www.youtube.com/watch?v=H1_zDC0SXQ8
• GitHub: https://github.com/yubainu/sibainu-engine
• Raw data: raw_logs.csv (full transparency)

LIMITATIONS:
• Tested on Gemma-2B only (2.5B parameters)
• Designed to scale, but needs validation on larger models
• Catches "structurally unstable" hallucinations (about half)
• Best used as first-line defense in ensemble systems

TECHNICAL NOTES:
• No external models needed (unlike self-consistency methods)
• No knowledge bases required (unlike RAG approaches)
• Adds ~1% inference time vs. 300-500% for semantic methods
• Works by monitoring the process not the product

I'd love feedback on:
• Validation on larger models (Seeking strategic partnerships and compute resources for large-scale validation.)
• Integration patterns for production systems
• Comparison with other structural approaches
• Edge cases where geometric signals fail

This represents a fundamentally different paradigm: instead of asking "is this text correct?", we ask "was the generation process unstable?" The answer is surprisingly informative.

Happy to discuss technical details in the comments!

Comments

yubainu•2h ago
I’ve been exploring why LLMs "break" during inference. Most current hallucination detection methods look at the final text (semantic analysis) or use another LLM to double-check (self-consistency). These are effective but extremely slow and expensive.

SIB-ENGINE is my attempt to solve this at the geometric layer. By monitoring the "Anchor Drift" (how hidden states deviate from the prompt’s latent trajectory), I found that hallucinations often manifest as a structural instability before the token is even sampled.

The Numbers:

Recall: 53.89% (It catches about half, but it's consistent)

Precision: 88.52% (Low false-alarm rate is my priority)

Overhead: <1% (Running on an RTX 3050 with 4GB VRAM)

AUC: 0.8995
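As a sanity check on how these figures fit together, precision can be derived from recall, false positive rate, and the share of samples that really are hallucinations. The sketch below uses the recall above and the 7% false positive rate quoted in the post; the prevalence values are assumptions, since the class balance of the N=1000 run is not stated.

```python
def precision_from_rates(recall: float, fpr: float, prevalence: float) -> float:
    """Precision implied by recall, false positive rate, and the fraction
    of samples that are true hallucinations (prevalence)."""
    true_pos = recall * prevalence            # flagged and actually hallucinated
    false_pos = fpr * (1.0 - prevalence)      # flagged but actually fine
    return true_pos / (true_pos + false_pos)


# Recall and FPR are the figures quoted in the post; prevalence values are assumptions.
for prevalence in (0.5, 0.2, 0.05):
    p = precision_from_rates(recall=0.5389, fpr=0.07, prevalence=prevalence)
    print(f"prevalence={prevalence:.2f} -> precision={p:.3f}")
```

Under an assumed balanced split, this gives a precision of about 0.885, close to the reported 88.52%. The same recall and false positive rate imply noticeably lower precision when hallucinations are rarer, which is worth keeping in mind for production traffic.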

I've released a Lite version (1-axis) on GitHub so you can see the fundamental logic and run it on your own machine. I’ve also included the raw_logs.csv from my N=1000 test run on Gemma-2B for full transparency.
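If you want to recompute the AUC from the logs yourself, a dependency-free version is short. This is a sketch, not code from the repository: the post does not describe the column layout of raw_logs.csv, so the function takes plain lists of labels (1 = hallucination) and scores, and the sample values are made up.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive sample scores
    higher than a random negative one (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and detector scores, only to show the call.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.3, 0.5, 0.8, 0.1]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic, which is fine for a run of about a thousand samples; for larger logs a rank-based formulation or a library routine would be the better choice.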

I’m particularly curious if anyone here has experimented with similar geometric approaches or has thoughts on how this might scale to 70B+ models where the latent space is significantly denser.

Happy to dive into the technical details!