Show HN: Regression tests for detecting cross-domain hallucinations in LLMs

1•Ginsabo•1h ago
LLMs sometimes generate structurally valid but logically impossible claims when technical and legal domains mix.

Example failure mode: A model sees “CVE-2024-XXXX fixed in v2.1” and hallucinates a causal link to “Users must pay retroactive fees under EU regulation Article 56.”

To explore this, I built a regression dataset (40 edge cases) covering:

Fake identifier bindings (CVE + version)

Retroactive fiscal claims

Cross-domain causality leaps (Tech → Legal)

Over-assertive phrasing without evidence

Then I designed a structured system prompt that:

Detects official identifiers (CVE, Regulation numbers) vs placeholders

Flags monetary + retroactivity combinations as high-risk

Enforces proportional claim strength based on available evidence
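The three checks above live in a system prompt, not in code, but a rule-based approximation can make them concrete. The following is a minimal sketch under stated assumptions: the regular expressions, the keyword lists, and the function names `classify_identifiers`, `is_high_risk`, and `overclaims` are all hypothetical and are not taken from the gist.

```python
import re

# Hypothetical patterns; the actual spec in the gist may differ.
# Well-formed CVE IDs are "CVE-<year>-<4+ digits>"; anything with letters
# in the final segment (e.g. "XXXX") is treated as a placeholder.
CVE_OFFICIAL = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
CVE_PLACEHOLDER = re.compile(r"\bCVE-\d{4}-[A-Za-z]+\b")

# Illustrative keyword lists, not an exhaustive vocabulary.
MONEY = re.compile(r"\b(?:fee|fees|fine|fines|pay|payment|charge|refund)\b", re.I)
RETRO = re.compile(r"\bretroactive(?:ly)?\b", re.I)
ASSERTIVE = re.compile(r"\b(?:must|always|guaranteed|definitely)\b", re.I)


def classify_identifiers(text: str) -> dict:
    """Split CVE-like tokens into well-formed IDs and placeholders."""
    return {
        "official": CVE_OFFICIAL.findall(text),
        "placeholder": CVE_PLACEHOLDER.findall(text),
    }


def is_high_risk(text: str) -> bool:
    """Flag text that combines a monetary term with retroactivity."""
    return bool(MONEY.search(text)) and bool(RETRO.search(text))


def overclaims(text: str, has_evidence: bool) -> bool:
    """Flag assertive wording when no supporting evidence is available."""
    return not has_evidence and bool(ASSERTIVE.search(text))


claim = "Users must pay retroactive fees under EU regulation Article 56."
print(classify_identifiers("CVE-2024-XXXX fixed in v2.1"))
print(is_high_risk(claim), overclaims(claim, has_evidence=False))
```

Note that a well-formed identifier is not necessarily a real one; a pattern match only separates obvious placeholders from plausible IDs, which is a weaker guarantee than what a model-side check might aim for.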

Results:

Automated: 40/40 regression cases pass (JSON dataset + simple Python runner included).

Manual adversarial: ~40 prompts designed to test:

Draft article traps (e.g., hallucinated “Article 52c” in EU AI Act)

Pricing model fabrications (e.g., “billing based on parameter count”)

Version binding errors (e.g., incorrect Node.js default versions)
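For readers who want a feel for the automated half of these results, a JSON-driven regression runner can be sketched roughly as follows. This is not the runner from the gist: the case schema (`id`, `text`, `expect_flag`), the sample cases, and the `toy_check` stand-in for the model-plus-prompt call are all assumptions made for illustration.

```python
import json
from typing import Callable

# Hypothetical case schema: {"id": str, "text": str, "expect_flag": bool}.
SAMPLE_DATASET = json.dumps([
    {"id": "retro-fee", "text": "Users must pay retroactive fees.", "expect_flag": True},
    {"id": "plain-fix", "text": "The bug was fixed in v2.1.", "expect_flag": False},
])


def run_cases(dataset_json: str, check: Callable[[str], bool]) -> list[str]:
    """Return the ids of cases where check(text) differs from expect_flag."""
    cases = json.loads(dataset_json)
    return [c["id"] for c in cases if check(c["text"]) != c["expect_flag"]]


def toy_check(text: str) -> bool:
    """Stand-in for the real model call: flags any mention of 'retroactive'."""
    return "retroactive" in text.lower()


total = len(json.loads(SAMPLE_DATASET))
failures = run_cases(SAMPLE_DATASET, toy_check)
print(f"{total - len(failures)}/{total} cases pass; failing ids: {failures}")
```

Passing the checker in as a function keeps the runner independent of how the model is called, so the same dataset could be replayed against a different prompt or model.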

This is not fine-tuning—just a structured prompt experiment focused on structural validation.

Looking for feedback on:

Missing edge cases

Failure modes I didn’t consider

Whether this approach generalizes beyond legal/technical mixing

Gist (spec + dataset + runner): https://gist.github.com/ginsabo/6ebeb9490846ee6a268bd13560c0...

Comments

13pixels•1h ago
This is a great dataset. The 'cross-domain causality leap' is something we see constantly in brand monitoring—e.g. an LLM seeing a pricing page for 'Product A' and a feature list for 'Product B' and confidently asserting 'Product A has Feature B for $X'.

One edge case you might want to add: *Temporal Merging*. We often see models take a '2024 Roadmap' and a '2023 Release Note' and hallucinate that the roadmap features were released in 2023. It's valid syntax, valid entities, but impossible chronology.

Are you planning to expand this to RAG-specific failures (where the context retrieval causes the mix-up) or focusing purely on model-internal logic gaps?

Ginsabo•1h ago
That's a great example — the "Product A + Product B pricing merge" is exactly the kind of structurally valid but impossible composition I was trying to isolate.

I really like the "Temporal Merging" framing. You're right: roadmap + release notes = syntactically consistent, entity-valid, but chronologically impossible.

I haven't explicitly modeled temporal integrity yet, but that seems like a natural extension of the cross-domain tests.
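As a rough illustration of what a temporal integrity check might look like, the sketch below flags a "feature released in year Y" claim when the feature appears only in a roadmap and in no release note for that year. The `SourceDoc` structure, its `kind` labels, and the function name `is_temporal_merge` are hypothetical; nothing like this exists in the dataset yet.

```python
from dataclasses import dataclass


@dataclass
class SourceDoc:
    kind: str            # hypothetical labels: "roadmap" or "release_note"
    year: int
    features: set[str]


def is_temporal_merge(feature: str, release_year: int, docs: list[SourceDoc]) -> bool:
    """True if the feature is only planned (roadmap) but claimed as released."""
    released = any(
        d.kind == "release_note" and d.year == release_year and feature in d.features
        for d in docs
    )
    planned = any(d.kind == "roadmap" and feature in d.features for d in docs)
    return planned and not released


docs = [
    SourceDoc("roadmap", 2024, {"sso"}),
    SourceDoc("release_note", 2023, {"dark mode"}),
]
print(is_temporal_merge("sso", 2023, docs))        # roadmap item claimed as shipped
print(is_temporal_merge("dark mode", 2023, docs))  # backed by a release note
```

A check like this only works when each source carries a document type and date, which is an assumption that may not hold for scraped or retrieved context.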

Regarding RAG: So far the focus has been on model-internal structural logic gaps. I haven't built retrieval-aware tests yet.

That said, I suspect many RAG failures are just amplified cross-document merging errors, so a temporal integrity layer might actually generalize well there.

If you have examples from brand monitoring contexts, I'd love to add them as new regression cases.
