Three independent papers have proven that LLM hallucination is mathematically inevitable (Xu et al. 2024, Banerjee et al. 2024, Karpowicz 2025). You can't train it away. You can't prompt it away. So I built a verification layer instead.
How it works: LUCID extracts implicit claims from AI-generated code (e.g., "this function handles null input," "this query is injection-safe," "this handles concurrent access"), then uses a second, adversarial AI pass to verify each claim against the actual implementation. You get a report showing exactly what would have shipped to production without verification.
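Here is a minimal sketch of that extract-then-verify pattern. The type and function names (Claim, Verdict, verifyGeneratedCode) are illustrative assumptions, not LUCID's actual API:

```typescript
// Pass 1 extracts implicit claims from the generated code; pass 2 adversarially
// checks each claim against the implementation and returns a verdict per claim.

interface Claim {
  id: string;
  text: string; // e.g. "this function handles null input"
  kind: "null-safety" | "injection-safety" | "concurrency" | "other";
}

interface Verdict {
  claim: Claim;
  verified: boolean;
  evidence: string; // pointer into the implementation (file/line or snippet)
}

async function verifyGeneratedCode(
  code: string,
  extract: (code: string) => Promise<Claim[]>,              // pass 1: claim extraction
  verify: (code: string, claim: Claim) => Promise<Verdict>  // pass 2: adversarial check
): Promise<Verdict[]> {
  const claims = await extract(code);
  // Check every claim independently so one unverified claim doesn't hide another.
  return Promise.all(claims.map((claim) => verify(code, claim)));
}
```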
"But can't the verifier hallucinate too?" Yes -- and that's the right question. The benchmarks below were validated by running real test suites, not by trusting LUCID's judgment. The value is that structured claim extraction + adversarial verification catches bugs that a single generation pass misses. The architecture also supports swapping LLM verification for formal methods (SMT solvers, property-based testing) per claim type as those integrations mature.
Benchmarks:
- HumanEval: 86.6% baseline -> 100% pass@5 with LUCID (164/164 problems)
- SWE-bench: 18.3% baseline -> 30.3% with LUCID (+65.5%)
- Both benchmarks were validated by running actual test suites, not by LLM judgment
- LLM-as-judge actually performs worse at higher k values -- it hallucinates false positives
Three ways to use it:
1. MCP Server (Claude Code, Cursor, Windsurf) -- one config line, verification as a native tool
2. GitHub Action -- automated verification on every PR with inline comments
3. CLI -- npx lucid verify --repo /path/to/code
Free tier: 100 verifications/month. Get a key at https://trylucid.dev
Code: https://github.com/gtsbahamas/hallucination-reversing-system
Paper: https://doi.org/10.5281/zenodo.18522644
Dashboard: https://trylucid.dev