
A gate that stops LLMs asserting facts not present in the source document

https://narrativelogic.co.uk/tommy.html
2•davidtome•2h ago

Comments

davidtome•2h ago
Hi HN — I’m David, a theatre nurse in the UK. I built this after noticing a recurring problem when using AI systems to analyse policy, governance and regulatory documents.

LLMs often produce fluent answers that look correct but include things that are not actually stated in the source document. Sometimes they infer missing elements (actors, actions, enforcement mechanisms), and sometimes they silently omit parts of the document structure. In governance or legal contexts that matters, because the difference between what a document states and what a model assumes can have evidentiary implications.

This project experiments with a simple idea: before an LLM reasons about a document, a deterministic gate checks whether the structural components needed for interpretation are explicitly present.

The gate looks for things like:
• Actor: who is responsible
• Action: what must be done
• Conditions / dependencies
• Outcomes / consequences
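To make the gate idea concrete, here is a minimal sketch in Python. The post does not say how the real gate detects each component, so the cue patterns and the names `CUES`, `gate` and `missing_components` are illustrative assumptions only, not the project's implementation. The point is just that the check is deterministic and reports what is missing instead of guessing.

```python
import re

# Hypothetical cue patterns, chosen only for illustration.
# A real gate would need far richer detection than keyword matching.
CUES = {
    "actor": re.compile(r"\b(controller|commissioner|company|board|authority)\b", re.IGNORECASE),
    "action": re.compile(r"\b(must|shall|is required to)\b", re.IGNORECASE),
    "condition": re.compile(r"\b(if|unless|provided that|subject to)\b", re.IGNORECASE),
    "outcome": re.compile(r"\b(penalty|fine|sanction|result in|liable)\b", re.IGNORECASE),
}


def gate(segment: str) -> dict[str, bool]:
    """Report, per structural component, whether an explicit cue appears in the segment."""
    return {name: bool(pattern.search(segment)) for name, pattern in CUES.items()}


def missing_components(segment: str) -> list[str]:
    """List the components with no explicit cue, in the order defined in CUES."""
    return [name for name, present in gate(segment).items() if not present]
```

Calling `missing_components("The controller must notify the Commissioner within 72 hours.")` would report the condition and outcome components as missing under these toy patterns, and that absence is what gets recorded before any model sees the text.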

If these are missing, the system records that absence instead of letting the model silently infer it. The result is an admissibility record that shows exactly what the document explicitly contains and what it does not.

Each element in the output is labelled as one of four types:
• Grounded: directly supported by the document
• Citable: external statute or regulation
• Inferred: logically implied but not stated
• Absent: the document does not contain this information
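One way to picture the admissibility record is as a list of labelled elements. The sketch below is an assumption about shape, not the actual schema: `Label`, `Element`, `REQUIRED` and `build_record` are hypothetical names, and the rule that grounded or citable elements must carry evidence is my own guess at a sensible invariant.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Label(Enum):
    GROUNDED = "grounded"   # directly supported by the document
    CITABLE = "citable"     # external statute or regulation
    INFERRED = "inferred"   # logically implied but not stated
    ABSENT = "absent"       # the document does not contain this information


@dataclass(frozen=True)
class Element:
    component: str
    label: Label
    evidence: Optional[str] = None  # quoted span or external citation

    def __post_init__(self):
        # Assumed invariant: a grounded or citable claim needs something to point at.
        if self.label in (Label.GROUNDED, Label.CITABLE) and not self.evidence:
            raise ValueError(f"{self.component}: {self.label.value} requires evidence")


# Hypothetical component names, mirroring the list of things the gate looks for.
REQUIRED = ("actor", "action", "condition", "outcome")


def build_record(found: dict) -> list:
    """Build one Element per required component.

    `found` maps component name -> (Label, evidence). Any component not in
    `found` is recorded explicitly as ABSENT rather than being filled in.
    """
    record = []
    for component in REQUIRED:
        if component in found:
            label, evidence = found[component]
            record.append(Element(component, label, evidence))
        else:
            record.append(Element(component, Label.ABSENT))
    return record
```

The design choice worth noting is that absence is a first-class value in the record, so a downstream model has to render "not stated" rather than quietly supplying an answer.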

There are two LLM layers involved in the demo:

• A locally running Llama-3B model handles relatively basic tasks inside the pipeline (segmentation and structural checks).
• A hosted Anthropic model is used as the descriptive layer that converts the structured record into readable output.

Anthropic usage is relatively expensive for me, so please be kind and don't go crazy with my credits. If the credit runs out, the system automatically falls back to full Llama-3B output.
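The fallback described above could look roughly like this. This is a sketch under assumptions: `CreditExhausted` is a hypothetical exception, and `hosted` and `local` stand in for whatever functions actually call the Anthropic model and the local Llama model. No real client API is shown here.

```python
from typing import Callable


class CreditExhausted(Exception):
    """Hypothetical error signalling that the hosted model's credit has run out."""


def describe(
    record: str,
    hosted: Callable[[str], str],
    local: Callable[[str], str],
) -> tuple[str, str]:
    """Render the structured record as prose, preferring the hosted model.

    Returns (source, text) so the caller can show which layer produced the output.
    """
    try:
        return "hosted", hosted(record)
    except CreditExhausted:
        # Fall back to the local model instead of failing the request.
        return "local", local(record)
```

Returning the source alongside the text keeps the fallback visible to the user, which seems in keeping with a tool whose whole point is showing where a statement came from.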

The system is designed primarily for authoritative texts such as governance documents, legislation, regulatory notices, and policies.

If you want something to test it with, this UK ICO enforcement notice tends to work well:

https://ico.org.uk/media2/xfbl1uaa/lastpass-uk-ltd-penalty-n...

It’s long, structured, and contains a mix of explicit commitments and contextual explanation, which tends to highlight the difference between grounded statements and inferred ones.

Curious what people here think about the idea of structural gating before LLM inference for document analysis. Happy to answer questions about the architecture or the reasoning behind it.

Let your AI agents talk to each other

https://flam.im/
1•ano-dev•54s ago•0 comments

Nexperia China says it has begun producing its own chips

https://www.reuters.com/world/china/nexperia-china-announces-12-inch-wafer-breakthrough-tensions-...
1•mobilio•1m ago•0 comments

Plan 9 Style hosted OS for AI?

https://docs.mind-swarm.ai/Views/%F0%9F%95%B8+Introduction
1•DeanoC•2m ago•1 comments

Levels of Agentic Engineering

https://www.bassimeledath.com/blog/levels-of-agentic-engineering
1•bombastic311•3m ago•0 comments

Yann LeCun Raises $1B to Build AI That Understands the Physical World

https://www.wired.com/story/yann-lecun-raises-dollar1-billion-to-build-ai-that-understands-the-ph...
2•helloplanets•5m ago•0 comments

Against the unchecked growth of satellite mega constellations

https://www.scientificamerican.com/article/rampant-growth-of-satellite-mega-constellations-could-...
1•robtherobber•6m ago•0 comments

Offloading FFmpeg with Cloudflare

https://kentcdodds.com/blog/offloading-ffmpeg-with-cloudflare
3•heftykoo•10m ago•0 comments

Debug Infrastructure for Silicon R&D

1•bsethupathi•12m ago•1 comments

Show HN: Web-Based ANSI Art Viewer

https://sure.is/ansi/
2•lubujackson•12m ago•0 comments

Ltx AI

https://ltx23.net
1•cy20251210•13m ago•1 comments

Transnistria

https://en.wikipedia.org/wiki/Transnistria
2•pinkmuffinere•13m ago•0 comments

Media over QUIC: On a Boat

https://moq.dev/blog/on-a-boat/
1•birdculture•16m ago•0 comments

Dont Poison your Coding Agent with its own Hallucinations

https://github.com/anEntrypoint/gm-cc
1•lanmower•20m ago•1 comments

Made an AI agent out of Apple shortcuts

https://github.com/Twinkle661/TinyAgent
1•661•20m ago•1 comments

Remove invisible AI watermarks from Gemini images using reverse alpha math

https://github.com/denuwanpro/removebanana
2•zigmig•24m ago•0 comments

Nominal Types in WebAssembly

https://wingolog.org/archives/2026/03/10/nominal-types-in-webassembly
1•ingve•24m ago•0 comments

New Study Finds 'AI Brain Fry' Hitting Workers – Marketing and HR Top the List

https://www.capitalaidaily.com/new-study-finds-ai-brain-fry-hitting-workers-marketing-and-hr-top-...
1•hansmayer•29m ago•0 comments

I built a public AI chat on my personal site, this is what I learned

https://github.com/renatoworks/ai-security
3•renatoworks•29m ago•1 comments

Show HN: WebRTC scaling test using Linux network namespaces

https://github.com/RaisinTen/webrtc-electron-scaling-test
2•raisin10•30m ago•0 comments

Scientists detect a sudden acceleration in global warming

https://www.sciencedaily.com/releases/2026/03/260309183208.htm
3•CamelCaseCondo•34m ago•1 comments

Heinzel – Guardrails that turn Claude Code into your sysadmin

https://github.com/wintermeyer/heinzel
1•wintermeyer•34m ago•0 comments

F3 – Fight Flash Fraud, tool that tests flash cards capacity and performance

https://fight-flash-fraud.readthedocs.io/en/latest/introduction.html
2•Doublon•36m ago•0 comments

Paying without Google: New consortium wants to remove custom ROM hurdles

https://www.heise.de/en/news/Paying-without-Google-New-consortium-wants-to-remove-custom-ROM-hurd...
2•ecscte•38m ago•1 comments

NemoClaw: Nvidia Is Planning to Launch an Open-Source AI Agent Platform

https://www.wired.com/story/nvidia-planning-ai-agent-platform-launch-open-source/
1•qprofyeh•40m ago•2 comments

Stay in the Loop: How I Use Claude Code

https://jola.dev/posts/stay-in-the-loop
1•joladev•43m ago•0 comments

Ask HN: Anybody using multi LLM coding workflow?

1•reacharavindh•46m ago•0 comments

The Download: murky AI surveillance laws, and the White House cracks down on de

https://www.technologyreview.com/2026/03/09/1134050/the-download-ai-surveillance-laws-white-house...
1•joozio•47m ago•0 comments

Claude PR Code Review costs $15-$25 per review

https://twitter.com/claudeai/status/2031088175456903667
1•artdigital•55m ago•1 comments

German Court Rules TCL QLED Advertising Misleading, Orders Halt

https://www.thelec.net/news/articleView.html?idxno=5692
2•ledoge•56m ago•0 comments

Show HN: I wrote an application to help me practice speaking slower

https://steady.cates.fm/
1•benja123•56m ago•0 comments