The first line "Why do language models sometimes just make things up?" was not what I was expecting to read about.
Regardless of whether those terms, in the AI context, correlate perfectly with their original meanings.
I get that this is different from getting an LLM to admit that it doesn’t know something, but I thought “getting a coding agent to stop spinning its wheels when set to an impossible task” was months or years away, and then suddenly it was here.
I haven’t yet read a good explanation of why Claude 4 is so much better at this kind of thing, and it definitely goes against what most people say about how LLMs are supposed to work (which is a large part of why I’ve been telling people to stop leaning on mechanical explanations of LLM behavior/strengths/weaknesses). However, it was clearly a step-function improvement.
Ask them to solve one of the Millennium Prize Problems. They’ll say they can’t do it, but that 'No' is just memorized. There’s nothing behind it.
> Unfortunately, the term hallucination quickly stuck to this phenomenon — before any psychologist could object.
The only difference between the two is whether a human likes it. If the human doesn't like it, then it's a hallucination. If the human doesn't know it's wrong, then it's not a hallucination (as far as that user is concerned).
The term "hallucination" is just marketing BS. In any other case it'd be called "broken shit".
The term hallucination is used as if the network is somehow giving the wrong output. It's not. It's giving a probability distribution for the next token, which is exactly what it was designed to do. The misunderstanding lies in what the user thinks they are asking for. They think they are asking for a correct answer, but they are actually asking for a plausible answer. Those are very different things. An LLM is designed to give plausible, not necessarily correct, answers. And when a user asks for a plausible but not necessarily correct answer (whether or not they realize it) and gets a plausible but not necessarily correct answer, then the LLM is working exactly as intended.
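To make that concrete, here's a toy sketch (a made-up four-word vocabulary and invented logits, not any real model's numbers): all the network emits is a probability distribution over the next token, and sampling from it rewards plausibility, not truth.

```python
import math
import random

# Toy illustration: the model's raw output is just scores (logits) over a
# vocabulary for the next token -- nothing in this step checks factual truth.
vocab = ["Paris", "Lyon", "Berlin", "Madrid"]
logits = [4.0, 1.5, 1.0, 0.5]  # invented scores for "The capital of France is ..."

# Softmax turns the scores into a probability distribution.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Sampling picks a token in proportion to plausibility, so a wrong but
# plausible token ("Lyon") still gets chosen some fraction of the time.
next_token = random.choices(vocab, weights=probs, k=1)[0]

print({word: round(p, 3) for word, p in zip(vocab, probs)})
print("sampled next token:", next_token)
```

Run it a few times and you'll occasionally get "Lyon": the distribution is doing exactly its job; it's the user who reads the sample as a verified fact.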