That does not mean we can't have a technical discussion that bypasses at least some of those considerations.
Look at the data exfiltration attacks, e.g. https://simonwillison.net/2025/Aug/9/bay-area-ai/
Or the parallel comment about a coding LLM deleting a database.
Between prompt injection and hallucination, or just plain "mistakes", these systems can do bad things whether compromised or not, and so, on a risk-adjusted basis, they should be handled that way, e.g. with a human in the loop, output sanitization, etc.
The point is that, with an appropriate design, you should barely care whether the underlying LLM was actively compromised.
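As a minimal sketch of what "appropriate design" could mean (all names and checks here are hypothetical, not any particular library's API), treat the model's output as untrusted and gate it regardless of whether the model itself is compromised:

```python
# Hypothetical sketch: the LLM is treated as untrusted input, so its output
# passes through policy checks and a human approval step before anything runs.
import re

ALLOWED_ACTIONS = {"summarize", "draft_reply"}   # allowlist of permitted actions
URL_PATTERN = re.compile(r"https?://\S+")        # crude check for exfiltration via links

def handle_llm_output(action, text, approve):
    """Return the text only if it passes policy checks and human review."""
    if action not in ALLOWED_ACTIONS:
        return None                              # unknown action: drop it
    if URL_PATTERN.search(text):
        return None                              # strip anything that could phone home
    if not approve(text):                        # human in the loop decides
        return None
    return text
```

The same gate applies whether the model is merely mistaken or actively hostile, which is the point: the design barely depends on trusting the LLM.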
What that means is that you cannot trust a human in the loop to somehow make it safe; it was not safe with only humans either.
The key difference is that LLMs are fast and relentless, while humans are slow and get tired. Humans have friction, and friction means errors are generated more slowly too.
Once you embrace these differences, it's a lot easier to understand where and how LLMs should be used.
This argument is everywhere and is frustrating to debate. If it were true, we’d quickly find ourselves in absurd territory:
> If I can go to a restaurant and order food without showing ID, there should be an unprotected HTTP endpoint to place an order without auth.
> If I can look into my neighbor's house, I should be allowed to put up a camera pointed at their bedroom window.
Or, the more popular one today:
> A human can listen to music without paying royalties, therefore an AI company is allowed to ingest all music in the world and use the result for commercial gain.
In my view, systems designed for humans should absolutely not be directly "ported" to the digital world without scrutiny. Doing so ultimately means human concerns can be dismissed. Whether deliberately or not, our existing systems have been carefully tuned to account for the quantities and effort rooted in human nature. They are very rarely tuned to handle the rates, fidelity, and scale that machines can achieve cheaply.
Generally, when people talk about wanting a human in the loop, it’s not with the expectation that humans have achieved perfection. I would make the argument that most people _are_ experts at their specific job or at least have a more nuanced understanding of what correct looks like.
Having a human in the loop is important because LLMs can make absolutely egregious mistakes and cannot be "held responsible". Of course humans can also make egregious mistakes, but we can be held responsible, and improve for next time.
The reason we don’t fire developers for accidentally taking down prod is precisely because they can learn, and not make that specific mistake again. LLMs do not have that capability.
Even if the average error rate were the same (which is hardly safe to assume), there are other reasons not to assume equivalence:
1. The shape and distribution of the errors may be very different in ways which make the risk/impact worse.
2. Our institutional/system tools for detecting and recovering from errors are not the same.
3. Human errors are often things other humans can anticipate or simulate, and are accustomed to doing so.
> friction
Which would be one more item:
4. An X% error rate at a volume limited by human action may be acceptable, while the same X% error rate at a much higher volume could be vastly more damaging (see the arithmetic sketch below).
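To make that concrete, a back-of-the-envelope comparison (the numbers are invented; only the ratio matters):

```python
# Same error rate, very different volumes (illustrative numbers only).
error_rate = 0.01
human_actions_per_day = 40        # limited by human speed, fatigue, friction
llm_actions_per_day = 40_000      # limited mostly by compute and rate limits

print(error_rate * human_actions_per_day)   # ~0.4 mistakes/day, i.e. one every few days
print(error_rate * llm_actions_per_day)     # ~400 mistakes/day
```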
_____________
"A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and Tequila." --Mitch Ratcliffe
Maybe as gains in LLM performance become smaller and smaller, companies will resort to trying to poison the pre-training dataset of competitors to degrade performance, especially on certain benchmarks. This would be a pretty fascinating arms race to observe.