The tool that won't let AI say anything it can't cite

https://github.com/grainulation/grainulator
35•volatilityfund•2d ago

Comments

nnevatie•2d ago
Considering that Claude sometimes confuses the identities of itself and the user, this might as well cite the user - "you just said X".
4ndrewl•2d ago
I tried it with the Car Wash question (it failed), and all its claims were mostly fuel consumption or emissions related, and this:

"factual (ai) Weather, traffic, and personal urgency are the only significant variables that could tilt the decision toward driving."

My gut feeling is that if this could be done, it would already be a core part of one of the model providers' output.

Lionga•2d ago
This is akin to writing "No hallucinations" in your proompt. So strange that even HN thinks it is worth anything.
hdemmer•2d ago
Used the demo app:

Q: Who directed Scarface? A: - 1983 film (most commonly referred to): Directed by Brian De Palma. - 1932 original version: Directed by Michael Curtiz.

This is wrong. The 1932 movie is by Howard Hawks.

0x3f•2d ago
Well, I would have tried it but the website kills Firefox.

Hard to see how you could really make this work though. You might as well just add "fetch and re-read all sources explicitly to make sure they are correct" to a normal prompt.

jampekka•2d ago
The HN title is quite a strong claim, but it's nowhere to be seen in the repo.

It seems to be fully prompt based, so the AI still can say anything it pleases.

How well do these complicated prompt systems usually work? My strategy is to stick mostly to just simple prompts with potentially some deterministic tools and vendor harnesses, based on the rationale that these are what the models are trained and evaluated with. And that LLMs still often get tripped up when their context is spammed with too much stuff.

sigmoid10•2d ago
The crazy thing is, you could do this. And it can be done 100% with code using zero prompting - just by limiting the output token set to a structured format and then further constraining parts of that to sources that were retrieved before. I know because I wrote such a system already. It could still match sources and answers incorrectly (just like this approach) but there is no need to rely on crazy prompts and agents to prevent hallucinations or missing outputs (which btw still lack any hard guarantees in the end). Prompting is a good strategy as models become smarter, but when you need reliability, you need to make use of the fact that they are still simple autoregressive completion engines. I don't get why everyone ignores this aspect, since I find it extremely useful all the time.
jampekka•2d ago
> I don't get why everyone ignores this aspect, since I find it extremely useful all the time.

My hunch is that it's because structured/constrained decoding and deterministic subsystems are technically somewhat more involved, requiring e.g. raw API interactions and sometimes manual decoding strategies. Prompt systems can be written in plain text and mostly with "common sense". Not to say writing a good prompt (system) is a trivial task, but it's a different skillset.

sigmoid10•17h ago
Not really. Most big model providers offer structured output decoding in their APIs. But you still have to do some actual programming and design at the end of the day instead of pure vibe-prompting.
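To make the idea in this sub-thread concrete, here is a minimal sketch of restricting citations to previously retrieved sources. It is not sigmoid10's system and not how grainulator works; the names `build_answer_schema` and `pick_source`, the `doc-N` IDs, and the scores are all hypothetical. The schema is the kind of thing one might hand to a provider's structured-output feature, and `pick_source` is a toy stand-in for masking disallowed candidates during decoding.

```python
def build_answer_schema(source_ids):
    """Build a JSON Schema whose `source` field may only be a retrieved ID.

    Hypothetical helper: the field names are assumptions for illustration.
    """
    return {
        "type": "object",
        "properties": {
            "claims": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        # The enum is what pins citations to retrieved sources.
                        "source": {"type": "string", "enum": sorted(source_ids)},
                    },
                    "required": ["text", "source"],
                    "additionalProperties": False,
                },
            }
        },
        "required": ["claims"],
        "additionalProperties": False,
    }


def pick_source(scores, allowed):
    """Toy constrained decoding step.

    Drop every candidate that is not in `allowed`, then take the
    highest-scoring survivor. Real systems apply the same masking
    to token logits rather than to whole IDs.
    """
    masked = {name: score for name, score in scores.items() if name in allowed}
    if not masked:
        raise ValueError("no allowed candidate to choose from")
    return max(masked, key=masked.get)


# Made-up example data: two retrieved documents, and a model that
# would prefer to cite a document that was never retrieved.
retrieved = {"doc-1", "doc-2"}
schema = build_answer_schema(retrieved)
model_scores = {"doc-9": 3.2, "doc-2": 1.1, "doc-1": 0.4}
chosen = pick_source(model_scores, retrieved)
```

Even with this constraint, the chosen source only has to exist in the retrieved set. Nothing here checks that it actually supports the claim, which is the mismatch risk mentioned above.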
Gijs4g•2d ago
The website fully stutters to a halt.

Managed to ask if Ali Khamenei is still alive. It answered "Yes, ..."

tomlockwood•2d ago
I love how at the beginning of this boom people were talking about how heuristics applied to AI outputs were short-term gains disguised as real progress. Now it seems like almost every new tool is a series of heuristics applied to AI outputs.
pjmalandrino•2d ago
Why are you building your own DAG system instead of just using LangGraph? You could cut complexity and focus on what actually matters: the claims, evidence tiers, conflict detection.

Also, embedding claims in the Chain of Thought instead of post-processing them might force rigor earlier in the pipeline.

(Assuming the zero-deps constraint isn't a blocker?)

est•2d ago
Looks like it just finds sources in Confluence to check against the bullshit Claude Code says?

I thought it could search for citations online.

todotask2•2d ago
The interactive app made my mouse movement really sluggish on macOS.
doginasuit•2d ago
I'm positive there are use-cases for this tool but after several years of working with LLMs, hallucinations have become a non-issue. You start to get a sense of the likely gaps in their knowledge just like you would a person.

Questions about application settings, for example, where to find a particular setting in a particular app. The LLM has a sense of how application settings are generally structured but the answer is almost never spot on. I just prefix these questions with "do a web search" or provide a link to documentation and that is usually enough to get a decent response along with citations.
