HalluciGuard – open-source middleware to detect and mitigate LLM hallucinations

https://github.com/Hermes-Lekkas/HalluciGuard
2•Hermes_dev•1h ago

Comments

Hermes_dev•1h ago

  Hi HN,


  Hallucinations remain the single biggest bottleneck for moving LLM applications from "cool demo" to "reliable production." Whether it’s a RAG pipeline inventing
  citations or an autonomous agent fabricating data, the lack of a reliable "truth layer" is costing companies billions in trust and market cap.


  I’m sharing HalluciGuard, an open-source middleware layer (AGPLv3) designed to act as a security and reliability buffer between LLM providers and your end-users.

  GitHub: https://github.com/Hermes-Lekkas/HalluciGuard


  How it works
  Instead of just hoping the model is right, HalluciGuard intercepts the LLM response and runs it through a multi-signal verification pipeline:


   1. Factual Claim Extraction: It uses lightweight LLMs to break down a response into discrete, verifiable factual claims.
   2. Multi-Signal Scoring: It evaluates each claim using four distinct signals:
       * Self-Consistency: LLM-as-a-judge verification.
       * Linguistic Heuristics: Detecting "uncertainty language" and high-risk patterns.
       * RAG Cross-Reference: Verifying claims directly against your retrieved documents.
       * Web Verification: Optionally pulling real-time snippets from search engines (like Tavily).
   3. Risk Flagging: It returns a comprehensive GuardedResponse with an overall "Trust Score" and flags specific claims as SAFE, MEDIUM, or CRITICAL risk.
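
  The three steps above can be sketched roughly as follows. This is an illustrative toy, not HalluciGuard's actual code or API: the weights, thresholds, sentence-based claim splitter, token-overlap RAG check, and the choice of the minimum claim score as the overall Trust Score are all assumptions made for the example. Only two of the four signals are implemented; the LLM-judge and web signals would plug into the same `combine` step.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    SAFE = "SAFE"
    MEDIUM = "MEDIUM"
    CRITICAL = "CRITICAL"


# Hypothetical signal weights, chosen only for illustration.
WEIGHTS = {"self_consistency": 0.4, "heuristics": 0.1, "rag": 0.3, "web": 0.2}

# A tiny, assumed list of "uncertainty language" markers.
HEDGE_WORDS = re.compile(r"\b(probably|reportedly|allegedly|might)\b", re.I)


@dataclass
class ClaimResult:
    text: str
    score: float
    risk: Risk


@dataclass
class GuardedResponse:
    content: str
    trust_score: float
    claims: list = field(default_factory=list)


def extract_claims(response: str) -> list:
    # Stand-in for the LLM-based extractor: treat each sentence as one claim.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]


def heuristic_signal(claim: str) -> float:
    # Penalize claims that contain uncertainty language.
    return 0.3 if HEDGE_WORDS.search(claim) else 1.0


def rag_signal(claim: str, documents: list) -> float:
    # Crude support measure: best fraction of claim tokens found in any document.
    tokens = set(re.findall(r"\w+", claim.lower()))
    if not tokens or not documents:
        return 0.0
    return max(
        len(tokens & set(re.findall(r"\w+", doc.lower()))) / len(tokens)
        for doc in documents
    )


def combine(signals: dict) -> float:
    # Weighted mean over the signals that were actually computed.
    total = sum(WEIGHTS[name] for name in signals)
    return sum(WEIGHTS[name] * value for name, value in signals.items()) / total


def classify(score: float) -> Risk:
    # Hypothetical thresholds.
    if score >= 0.7:
        return Risk.SAFE
    if score >= 0.4:
        return Risk.MEDIUM
    return Risk.CRITICAL


def guard(response: str, documents: list) -> GuardedResponse:
    results = []
    for claim in extract_claims(response):
        signals = {
            "heuristics": heuristic_signal(claim),
            "rag": rag_signal(claim, documents),
        }
        score = combine(signals)
        results.append(ClaimResult(claim, score, classify(score)))
    # Assumption: the response is only as trustworthy as its weakest claim.
    trust = min((r.score for r in results), default=1.0)
    return GuardedResponse(response, trust, results)
```

  Calling `guard("The Eiffel Tower is in Paris. It was probably built by aliens.", ["The Eiffel Tower is in Paris and was completed in 1889."])` flags the first claim as SAFE and the second as CRITICAL under these assumed thresholds.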


  Key Features
   * Provider Agnostic: Native support for OpenAI (GPT-5.x), Anthropic (Claude 4.6), Google (Gemini 3.1), and local models via Ollama.
   * OpenClaw Integration: We’ve built a native interceptor for the OpenClaw agent framework, allowing you to monitor agent actions and thoughts in real-time.
   * Streaming Support: Performs analysis asynchronously so you don't lose the "real-time" feel of streaming responses.
   * Cost-Optimization Cache: We cache verification results locally to reduce your API bills by avoiding redundant checks for common facts.
   * LangChain Ready: Includes a drop-in CallbackHandler for existing LangChain projects.
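
  To make the caching idea concrete, here is a minimal sketch of a local verification cache. It is not taken from the HalluciGuard codebase: the class name `VerificationCache`, the in-memory dictionary, the normalization rule, and the one-hour TTL are assumptions for illustration. The point is only that a repeated claim, after normalization, skips the expensive verifier call.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class VerificationCache:
    """Hypothetical cache wrapping an expensive claim verifier."""

    def __init__(
        self,
        verify: Callable[[str], float],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._verify = verify
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps claim hash -> (time stored, score).
        self._entries: Dict[str, Tuple[float, float]] = {}
        self.misses = 0

    @staticmethod
    def _key(claim: str) -> str:
        # Normalize case and whitespace so trivial variants share one entry.
        normalized = " ".join(claim.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def score(self, claim: str) -> float:
        key = self._key(claim)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # Fresh cached result: no verifier call.
        self.misses += 1
        value = self._verify(claim)
        self._entries[key] = (now, value)
        return value
```

  A TTL matters here because facts verified against the web can go stale; a real deployment would likely also want persistent storage rather than a per-process dictionary.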

  Why AGPLv3?
  We believe the "Truth Layer" of the AI stack should be owned by the community, not hidden behind a proprietary corporate API.


  What’s Next?
  We are currently working on "Lookahead Verification" (v0.9), which will attempt to auto-correct hallucinations during the token generation phase before the user
  ever sees them.


  I'd love to get the community's feedback on our scoring heuristics and hear about the edge cases you're seeing in production.
  Happy to answer any technical questions about the architecture or the benchmark results we've seen so far!
verdverm•1h ago
The feedback I have is that HN is not really interested in projects that were started less than an hour ago (see git history), with an even newer HN account and submission.

see also: https://news.ycombinator.com/item?id=47089907
