The "confident idiot" problem: Why AI needs hard rules, not vibe checks

https://steerlabs.substack.com/p/confident-idiot-problem
63•steerlabs•3d ago

Comments

jqpabc123•3d ago
We are trying to fix probability with more probability. That is a losing game.

Thanks for pointing out the elephant in the room with LLMs.

The basic design is non-deterministic. Trying to extract "facts" or "truth" or "accuracy" is an exercise in futility.

steerlabs•3d ago
Exactly. We treat them like databases, but they are hallucination machines.

My thesis isn't that we can stop the hallucinating (non-determinism), but that we can bound it.

If we wrap the generation in hard assertions (e.g., assert response.price > 0), we turn 'probability' into 'manageable software engineering.' The generation remains probabilistic, but the acceptance criteria become binary and deterministic.
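
As a rough sketch of the shape I mean, in plain Python (generate() is a caller-supplied stand-in for the model call; this is illustrative only, not the steer API):

    import json
    from typing import Callable

    MAX_ATTEMPTS = 3

    def get_price_quote(generate: Callable[[str], str], prompt: str) -> dict:
        # Generation stays probabilistic; acceptance is binary.
        for _ in range(MAX_ATTEMPTS):
            raw = generate(prompt)              # non-deterministic LLM call
            try:
                response = json.loads(raw)      # gate 1: output must be valid JSON
                assert response["price"] > 0    # gate 2: price must exist and be positive
            except (json.JSONDecodeError, KeyError, TypeError, AssertionError):
                continue                        # reject this sample and regenerate
            return response                     # passed every deterministic check
        raise ValueError("no sample passed the acceptance checks")

The model can still say anything it likes; it just can't get past the gate unless the output satisfies the hard rules.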

jqpabc123•2d ago
> but the acceptance criteria become binary and deterministic.

Unfortunately, the use case for AI is often one where the acceptance criteria are not easily defined: it's a matter of judgment. For example, "Does this patient have cancer?".

In cases where the criteria can be easily and clearly stipulated, AI often isn't really required.

steerlabs•15h ago
You're 100% right. For a "judgment" task like "Does this patient have cancer?", the final acceptance check has to be a human expert. A purely deterministic verifier is impossible.

My thesis is that even in those "fuzzy" workflows, the agent's process is full of small, deterministic sub-tasks that can and should be verified.

For example, before the AI even attempts to analyze the X-ray for cancer, it must:

1/ Verify it has the correct patient file (PatientIDVerifier).

2/ Verify the image is a chest X-ray and not a brain MRI (ModalityVerifier).

3/ Verify the date of the scan is within the relevant timeframe (DateVerifier).

These are "boring," deterministic checks. But a failure on any one of them makes the final "judgment" output completely useless.

steer isn't designed to automate the final, high-stakes judgment. It's designed to automate the pre-flight checklist, ensuring the agent has the correct, factually grounded information before it even begins the complex reasoning task. It's about reducing the "unforced errors" so the human expert can focus only on the truly hard part.
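
To make the "boring" part concrete, a minimal sketch of those checks as plain functions (record fields and thresholds are invented for illustration; this is not the steer API):

    from datetime import date, timedelta

    def verify_patient_id(record: dict, expected_id: str) -> bool:
        # PatientIDVerifier: the file must belong to the patient in question
        return record.get("patient_id") == expected_id

    def verify_modality(record: dict, required: str = "chest_xray") -> bool:
        # ModalityVerifier: a brain MRI is useless for a chest X-ray task
        return record.get("modality") == required

    def verify_scan_date(record: dict, max_age_days: int = 90) -> bool:
        # DateVerifier: the scan must fall within the relevant timeframe
        scan_date = record.get("scan_date")
        return scan_date is not None and date.today() - scan_date <= timedelta(days=max_age_days)

    def preflight(record: dict, expected_id: str) -> bool:
        # Each check is boring and deterministic; any single failure
        # invalidates whatever judgment comes afterwards.
        return all([
            verify_patient_id(record, expected_id),
            verify_modality(record),
            verify_scan_date(record),
        ])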

malfist•34m ago
Why do any of those checks with AI, though? For all of them you can get a less error-prone answer without AI.
jennyholzer•20m ago
Robo-eugenics is the best answer I can come up with
multjoy•27m ago
AI doesn’t necessarily mean an LLM, and LLMs are the systems making things up.
scotty79•49m ago
> We treat them like databases, but they are hallucination machines.

Which is kind of crazy because we don't even treat people as databases. Or at least we shouldn't.

Maybe it's one of those things that will disappear from culture one funeral at a time.

hrimfaxi•27m ago
Humans demand more reliability from our creations than from each other.
squidbeak•33m ago
I don't agree that users see them as databases. Sure there are those who expect LLMs to be infallible and punish the technology when it disappoints them, but it seems to me that the overwhelming majority quickly learn what AI's shortcomings are, and treat them instead like intelligent entities who will sometimes make mistakes.
philipallstar•32m ago
> but it seems to me that the overwhelming majority

The overwhelming majority of what?

Davidzheng•40m ago
lol humans are non-deterministic too
some_furry•20m ago
Human minds are more complicated than a language model that behaves like a stochastic echo.
pixl97•16m ago
Birds are more complicated than jet engines, but jet engines travel a lot faster.
loloquwowndueo•14m ago
They also kill a lot more people when they fail.
rthrfrd•20m ago
But we also have a stake in our society, in the form of a reputation or accountability, that greatly influences our behaviour. So comparing us to an LLM has always been meaningless anyway.
jennyholzer•17m ago
to be fair, the people most antisocially obsessed with dogshit AI software are completely divorced from the social fabric and are not burdened by these sorts of juvenile social ties
fzeindl•25m ago
Bruce Schneier put it well:

"Willison’s insight was that this isn’t just a filtering problem; it’s architectural. There is no privilege separation, and there is no separation between the data and control paths. The very mechanism that makes modern AI powerful - treating all inputs uniformly - is what makes it vulnerable. The security challenges we face today are structural consequences of using AI for everything."

- https://www.schneier.com/crypto-gram/archives/2025/1115.html...

zahlman•24m ago
I can still remember when https://en.wikipedia.org/wiki/Fuzzy_electronics was the marketing buzz.
HarHarVeryFunny•22m ago
The factuality problem with LLMs isn't because they are non-deterministic or statistically based, but simply because they operate at the level of words, not facts. They are language models.

You can't blame an LLM for getting the facts wrong, or hallucinating, when by design they don't even attempt to store facts in the first place. All they store are language statistics, boiling down to "with preceding context X, most statistically likely next words are A, B or C". The LLM wasn't designed to know or care that outputting "B" would represent a lie or hallucination, just that it's a statistically plausible potential next word.
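
A toy caricature of that in Python (real models learn weights over tokens rather than storing a lookup table, and the numbers here are invented):

    import random

    # Given a preceding context, a distribution over statistically plausible next words.
    # Nothing here knows or cares which continuation is factually true.
    next_word_probs = {
        "The capital of Australia is": {"Canberra": 0.6, "Sydney": 0.3, "Melbourne": 0.1},
    }

    def sample_next_word(context: str) -> str:
        dist = next_word_probs[context]
        words, weights = zip(*dist.items())
        return random.choices(words, weights=weights, k=1)[0]

    # "Sydney" is wrong but statistically plausible, so it will sometimes be emitted.
    print(sample_next_word("The capital of Australia is"))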

toddmorey•10m ago
Yeah, that’s very well put. They don’t store black-and-white; they store billions of grays. This is why tool use for research and grounding has been so transformative.
DoctorOetker•21m ago
Determinism is not the issue. Synonyms exist; there are multiple ways to express the same message.

When numeric models are fit to, say, scientific measurements, they do quite a good job of modeling the probability distribution. With a corpus of text we are not modeling truths but claims. The corpus contains contradicting claims. Humans have conflicting interests.

Source-aware training (which can't be done as an afterthought LoRA tweak, but needs to be done during base model training AKA pretraining) could enable LLMs to express which answers apply according to which sources. It could provide a review of competing interpretations and opinions, and source every belief, instead of having to rely on tool use / search engines.

None of the base model providers would do it at scale since it would reveal the corpus and result in attribution.

In theory, entities like the European Union could mandate that LLMs used for processing government data, or sensitive citizen / corporate data, MUST be trained source-aware, which would improve the situation, also making the decisions and reasoning more traceable. This would also ease the discussions and arguments about copyright issues, since it is clear LLMs COULD BE MADE TO ATTRIBUTE THEIR SOURCES.

I also think it would be undesirable to eliminate speculative output; it should just be marked explicitly:

"ACCORDING to <source(s) A(,B,C,..)> this can be explained by ...., ACCORDING to <other school of thought source(s) D,(E,F,...)> it is better explained by ...., however I SUSPECT that ...., since ...."

If it could explicitly separate the schools of thought sourced from the corpus, and also separate its own interpretations and mark them as LLM-speculated suspicions, then we could still have the traceable references, without losing the potential novel insights LLMs may offer.

jennyholzer•15m ago
"chatGPT, please generate 800 words of absolute bullshit to muddy up this comments section which accurately identifies why LLM technology is completely and totally dead in the water."
DoctorOetker•12m ago
Less than 800 words, but more if you follow the link :)

https://arxiv.org/abs/2404.01019

"Source-Aware Training Enables Knowledge Attribution in Language Models"

sweezyjeezy•15m ago
You could make an LLM deterministic if you really wanted to without a big loss in performance (fix random seeds, make MoE batching deterministic).

I don't think using deterministic / stochastic as the dividing property is useful here if we're talking about a tool to mimic humans. Describing a human coder as 'deterministic' doesn't seem right - if you give one the same tasks under different environmental conditions, I don't think you get exactly the same outputs either. I think what we're really talking about is some sort of fundamental 'instability' of LLMs, a la chaos theory.
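
To illustrate why determinism isn't the interesting dividing line, a toy sketch with invented numbers: greedy decoding makes the output fully repeatable, and if the corpus statistics favour a popular misconception, it is repeatably wrong.

    # Greedy decoding over a toy next-word table: always pick the most probable continuation.
    next_word_probs = {
        "Humans only use": {
            "10% of their brain": 0.7,               # popular myth, over-represented in text
            "a fraction of their potential": 0.2,
            "part of their brain at a time": 0.1,
        },
    }

    def greedy_next_words(context: str) -> str:
        dist = next_word_probs[context]
        return max(dist, key=dist.get)               # identical answer on every run

    print(greedy_next_words("Humans only use"))      # deterministically outputs the myth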

pydry•14m ago
I find it amusing that once you try to take LLMs and do productive work with them, either this problem trips you up constantly OR the LLM ends up becoming a shallow UI over an existing app (not necessarily better, just different).
steerlabs•3d ago
OP here. I wrote this because I got tired of agents confidently guessing answers when they should have asked for clarification (e.g. guessing "Springfield, IL" instead of asking "Which state?" when asked "weather in Springfield").

I built an open-source library to enforce these logic/safety rules outside the model loop: https://github.com/imtt-dev/steer
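
The Springfield case boils down to a hard rule like this (a minimal sketch with an invented city list; illustrative only, not the steer API):

    from typing import Optional

    # Hard rule enforced outside the model loop: if the location is ambiguous
    # and no state was given, the agent must ask instead of guessing.
    AMBIGUOUS_US_CITIES = {"springfield", "portland", "columbus"}   # illustrative list

    def weather_request_action(city: str, state: Optional[str]) -> str:
        if city.lower() in AMBIGUOUS_US_CITIES and state is None:
            return "ASK_CLARIFICATION"   # never guess "Springfield, IL"
        return "PROCEED"

    assert weather_request_action("Springfield", None) == "ASK_CLARIFICATION"
    assert weather_request_action("Springfield", "IL") == "PROCEED"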

condiment•18m ago
This approach kind of reminds me of taking an open-book test. Performing mandatory verification against a ground truth is like taking the test, then going back to your answers and looking up whether they match.

Unlike a student, the LLM never arrives at a sort of epistemic coherence, where it knows what it knows, how it knows it, and how true that is likely to be. So you have to structure every problem into a format where the response can be evaluated against an external source of truth.

amorroxic•18m ago
Thanks a lot for this. Also one question in case anyone could shed a bit of light: my understanding is that setting temperature=0, top_p=1 would cause deterministic output (identical output given identical input). For sure it won’t prevent factually wrong replies/hallucination; it only maintains generation consistency (e.g. classification tasks). Is this universally correct, or is it dependent on the model used? (Or is this understanding downright wrong?)
stared•41m ago
What I do is actually run the task. If it is a script, get the logs. If it is a website, get screenshots. Otherwise it is coding in the blind.

It's like writing a script with the attitude "yeah, I am good at it, I don't need to actually run it to know it works" - well, likely, it won't work. Maybe because of a trivial mistake.

hnthrow0287345•39m ago
>We are trying to fix probability with more probability. That is a losing game.

Technically not, we just don't have it high enough

You're doing exactly what you said you wouldn't though. Betting that network requests are more reliable than an LLM: fixing probability with more probability.

Not saying anything about the code - I didn't look at it - but just wanted to highlight the hypocritical statements which could be fixed.

schmuhblaster•35m ago
This looks like a very pragmatic solution, in line with what seems to be going on in the real world [1], where reliability seems to be one of the biggest issues with agentic systems right now. I've been experimenting with a different approach to increase the amount of determinism in such systems: https://github.com/deepclause/deepclause-desktop. It's based on encoding the entire agent behavior in a small and concise DSL built on top of Prolog. While it's not as flexible as a fully fledged agent, it does, however, lead to much more reproducible behavior and more graceful handling of edge cases.

[1] https://arxiv.org/abs/2512.04123

kissgyorgy•28m ago
It's just simple validation with some error logging. Should be done the same way as for humans or any other input which goes into your system.

An LLM provides inputs to your system like any human would, so you have to validate them. Something like pydantic or Django forms is good for this.
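
For example, with pydantic the gate looks the same as it would for any other untrusted input (the Quote schema and its fields are invented for illustration):

    import json
    from typing import Optional

    from pydantic import BaseModel, PositiveFloat, ValidationError

    class Quote(BaseModel):
        # The schema the LLM output is supposed to satisfy
        item: str
        price: PositiveFloat      # rejects zero, negatives and non-numbers
        currency: str

    def parse_llm_output(raw: str) -> Optional[Quote]:
        try:
            return Quote(**json.loads(raw))
        except (json.JSONDecodeError, TypeError, ValidationError):
            return None           # log it and handle it, exactly like bad user input

    print(parse_llm_output('{"item": "widget", "price": 9.99, "currency": "USD"}'))
    print(parse_llm_output('{"item": "widget", "price": -1, "currency": "USD"}'))   # None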

gaigalas•26m ago
I don't think this approach can work.

Anyway, I've written a library in the past (way way before LLMs) that is very similar. It validates stuff and outputs translatable text saying what went wrong.

Someone ported the whole thing (core, DSL and validators) to python a while ago:

https://github.com/gurkin33/respect_validation/

Maybe you can use it. It seems it would save you time by not having to write so many verifiers: just use existing validators.

I would use this sort of thing very differently though (as a component in data synthesis).

raincole•24m ago
> We are trying to fix probability with more probability. That is a losing game.

> The next time the agent runs, that rule is injected into its context. It essentially allows me to “Patch” the model’s behavior without rewriting my prompt templates or redeploying code.

Must be satire, right?

jennyholzer•22m ago
satire is forbidden. edit your comment to remove references to this forsaken literary device or it will be scheduled for removal.
kreijstal•23m ago
> The most interesting part of this experiment isn’t just catching the error—it’s fixing it.

> When Steer catches a failure (like an agent wrapping JSON in Markdown), it doesn’t just crash.

Say you are using AI slop without saying you are using AI slop.

> It's not X, it's Y.

Kalanos•21m ago
Please refer to this as GenAI
fwip•18m ago
Confident idiot (an LLM) writes an article bemoaning confident idiots.
jennyholzer•11m ago
Confident idiots (commenters, LLMs, commenters with investments in LLMs) write posts bemoaning the article.

Your investment is justified! I promise! There's no way you've made a devastating financial mistake!

toddmorey•13m ago
Confident idiot: I’m exploring using an LLM for diagram creation.

I’ve found after about 3 prompts to edit an image with Gemini, it will respond randomly with an entirely new image. Another quirk is it will respond “here’s the image with those edits” with no edits made. It’s like a toaster that will catch on fire every eighth or ninth time.

I am not sure how to mitigate this behavior. I think maybe an LLM-as-judge step with vision to evaluate the output before passing it on to the poor user.
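
A minimal sketch of that judge loop, with both model calls left as caller-supplied placeholders (the names and the PASS verdict are assumptions; the point is the reject-and-retry structure, not any particular API):

    from typing import Callable

    MAX_RETRIES = 3

    def edit_with_judge(image: bytes, instruction: str,
                        edit_model: Callable[[bytes, str], bytes],
                        judge_model: Callable[[bytes, bytes, str], str]) -> bytes:
        for _ in range(MAX_RETRIES):
            candidate = edit_model(image, instruction)             # may silently return the image unchanged
            verdict = judge_model(image, candidate, instruction)   # vision judge: did the requested edit happen?
            if verdict == "PASS":
                return candidate
        raise RuntimeError("no candidate passed the judge; don't ship it to the user")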

nickdothutton•12m ago
- Claude, please optimise the project for performance.

o Claude goes away for 15 minutes, doesn't profile anything, many code changes.

o Announces project now performs much better, saving 70% CPU.

- Claude, test the performance.

o Performance is 1% _slower_ than previous.

- Claude, can I have a refund for the $15 you just wasted?

o [Claude waffles], "no".

klysm•5m ago
I’ve always found the hard numbers on performance improvement hilarious. It’s just mimicking what people say on the internet when they get performance gains
etamponi•12m ago
Aren't we just reinventing programming languages from the ground up?

This is the loop (and honestly, I predicted it way before it started):

1) LLMs can generate code from "natural language" prompts!

2) Oh wait, I actually need to improve my prompt to get LLMs to follow my instructions...

3) Oh wait, no matter how good my prompt is, I need an agent (aka a for loop) that goes through a list of deterministic steps so that it actually follows my instructions...

4) Oh wait, now I need to add deterministic checks (aka, the code that I was actually trying to avoid writing in step 1) so that the LLM follows my instructions...

5) <some time in the future>: I came up with this precise set of keywords that I can feed to the LLM so that it produces the code that I need. Wait a second... I just turned the LLM into a compiler.

The error is believing that "coding" is just accidental complexity. "You don't need a precise specification of the behavior of the computer" - this is the assumption that would make LLM agents actually viable. And I cannot believe that there are software engineers who think that coding is accidental complexity. I understand why PMs, CEOs, and other fun people believe this.

Side note: I am not arguing that LLMs/coding agents aren't nice. T9 was nice, autocomplete is nice. LLMs are very nice! But I am starting to get a bit fed up with everyone believing that you can get rid of coding.

blixt•2m ago
Yeah, I’ve found that letting AI build any larger amount of useful code and data for a user who does not review all of it requires a lot of “gutter rails”. Not just adding more prompting, because that is an after-the-fact solution. Not just verifying and erroring a turn, because that adds latency and lets the model start spinning out of control. It also means isolating tasks and autofixing output to keep the model on track.

Models definitely need less and less of this with each version that comes out, but it’s still what you need to do today if you want to be able to trust the output. And even in a future where models approach perfection, I think this approach will be the way to reduce latency and keep tabs on whether your prompts are producing the output you expected on a larger scale. You will also be building good evaluation data for testing alternative approaches, or even fine-tuning.
