The title/introduction is pure bait: it implies some "physical" connection to hallucinations in biological organisms, but the work is really about trying to single out certain parts of the model. LLMs are nothing at all like a biological system; our brains are orders of magnitude more complex than the machines we've built, which we no longer fully understand. Believing that these LLMs are some next stage in understanding intelligence is hubris.
Who cares? I wonder if any of the commenters is qualified enough to understand the research at all. I am not.
You're just arguing about semantics. It doesn't matter in any substantial way. Ultimately, we need a word to distinguish factual output from confidently asserted erroneous output. We use the word "hallucinate". If we used a different word, it wouldn't make any difference -- the observable difference remains the same. "Hallucinate" is the word that has emerged, and it is now by overwhelming consensus the correct word.
> Whenever they get something "right," it's literally by accident.
This is obviously false. A great deal of training goes into making sure they usually get things right. If an infinite number of monkeys on typewriters get something right, that's by accident. Not LLMs.
While I agree for many general aspects of LLMs, I do disagree with some of the meta-terms used when describing LLM behavior. For example, the idea that AI has "bias" is problematic because neural networks literally have a variable called "bias", so of course AI will always have "bias". Plus, a biased AI is literally the purpose behind classification algorithms.
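For concreteness, here's a minimal sketch (my own PyTorch toy example, not from the article) of what that "bias" variable literally is: the additive term b in y = Wx + b that every linear layer carries.

```python
# Illustrative only: "bias" here is a learnable additive term,
# nothing to do with bias in the social/statistical sense.
import torch
import torch.nn as nn

layer = nn.Linear(in_features=4, out_features=2, bias=True)
print(layer.bias)                          # a parameter literally named "bias"

x = torch.randn(1, 4)
manual = x @ layer.weight.T + layer.bias   # y = Wx + b, what the layer computes
print(torch.allclose(manual, layer(x)))    # True
```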
But these terms, "bias" and "hallucinations", are co-opted to spin a narrative of no longer trusting AI.
How in the world did creating an overly confident chatbot do a complete 180 on years of AI progress and sentiment?
It's called hallucination because it works by imagining you have the solution and then learning what the input needs to be to produce that solution. Treat the input or the output as the learnable parameters: learn an input that fits an output (or vice versa) instead of learning the network. You fix what the network sees as the "real world" to match "what you already knew", just like a hallucinating human does.
You can imagine how hard it is to find papers on this technique nowadays.
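For what it's worth, here is a minimal sketch of the technique as I understand it (my own toy example, not taken from any of those papers): freeze a trained network, fix the output you "already know", and gradient-descend on the input until the network sees something that produces it.

```python
# Sketch of input optimization against a frozen network (assumed setup).
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
for p in net.parameters():
    p.requires_grad_(False)                    # the network stays fixed

target = torch.tensor([2])                      # "what you already knew"
x = torch.zeros(1, 8, requires_grad=True)       # the input we learn instead
opt = torch.optim.Adam([x], lr=0.1)

for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(x), target)
    loss.backward()
    opt.step()

print(net(x).softmax(-1))   # the frozen net's prediction is pulled toward class 2
```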
Hallucinations are already associated with a type of behavior, which is (roughly defined) "subjectively seeing/hearing things which aren't there". This is an input-level error, so it's not the right umbrella term for the majority of errors happening with LLMs, many of which are at the output level.
I don't know what would be a better term, but we should distinguish between different semantic errors, such as:
- confabulating, i.e., recalling distorted or misinterpreted memories;
- lying, i.e., intentionally misrepresenting an event or memory;
- bullshitting, i.e., presenting a version without regard for the truth or provenance; etc.
I'm sure someone already made a better taxonomy, and hallucination is OK for normal public discussions, but I'm not sure why the distinctions aren't made in supposedly more serious works.
And I think we already distinguish between types of errors -- LLMs effectively don't lie, AFAIK, unless you're asking them to engage in role-play or something. They mostly either hallucinate/confabulate in the sense of inventing knowledge they don't have, or they just make "mistakes", e.g. in arithmetic, or in attempting to copy large amounts of code verbatim.
And when you're interested in mistakes, you're generally interested in a specific category of mistakes, like arithmetic, or logic, or copying mistakes, and we refer to them as such -- arithmetic errors, logic errors, etc.
So I don't think hallucination is taking away from any kind of specificity. To the contrary, it is providing specificity, because we don't call arithmetic errors hallucinations. And we use the word hallucination precisely to distinguish it from these run-of-the-mill mistakes.
What is indisputable is that LLMs, even though they are 'just' word generators, are remarkably good at generating factual statements and accurate answers to problems, yet also regrettably inclined to generating apparently equally confident counterfactual statements and bogus answers. That's all that 'hallucination' means in this context.
If this work can be replicated, it may offer a way to greatly improve the signal-to-bullshit ratio of LLMs, and that will be both impressive and very useful if true.
"Whenever they get something "right," it's literally by accident." "the random word generator"
First off, the input is not random at all, which raises the question of how random the output really is.
Second, it compresses data, which has an impact on that data: probably cleaning or adjusting it, which should reduce the "randomness" even more. It compresses data from us into concepts, and a high-level concept is more robust than "random".
Thinking or reasoning models also fine-tune the response by walking the hyperspace, basically collecting and strengthening data.
We as humans do very similar things, and no one is calling us just random word predictors...
And because of this, "hallucinations -- plausible but factually incorrect outputs" is an absolutely accurate description of what an LLM does when it responds with a low-probability output.
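To make "low-probability output" concrete, here's a minimal sketch (made-up logits, purely illustrative) of how next-token sampling works: the distribution is learned and heavily skewed, and the temperature controls how often unlikely tokens get picked.

```python
# Illustrative sampling from a learned, skewed distribution -- not uniform randomness.
import torch

logits = torch.tensor([4.0, 2.5, 0.5, -1.0])     # made-up scores for 4 candidate tokens

def sample(logits, temperature=1.0):
    probs = torch.softmax(logits / temperature, dim=-1)
    token = torch.multinomial(probs, num_samples=1).item()
    return token, probs

token, probs = sample(logits, temperature=0.7)
print(probs)   # heavily skewed toward the top token; low-probability picks are rare
```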
Humans also do this often enough btw.
Please stop saying an LLM is just a random word predictor.
Obviously "hallucinate" and "lie" are metaphors. Get over it. These are still emergent structures that we have a lot to learn from by studying. But I suppose any attempt by researchers to do so should be disregarded because Person On The Internet has watched the 3blue1brown series on Neural Nets and knows better. We know the basic laws of physics, but spend lifetimes studying their emergent behaviors. This is really no different.
The "emergent structures" you are mentioning are just the outcome of randomness guided by "gradiently" descending to data landscapes. There is nothing to learn by studying these frankemonsters. All these experiments have been conducted in the past (decades past) multiple times but not at this scale.
We are still missing basic theorems, not stupid papers about which tech bro payed the highest electricity bill to "train" on extremely inefficient gaming hardware.
Biological systems are hard.
I'm extremely comfortable calling this paper complete and utter bullshit (or, I suppose if I'm being charitable, extremely poorly titled) from the title alone.
This type of research is absolutely valid.
An LLM doesn't just hallucinate.
I recently almost fell on a tram as it accelerated suddenly; my arm reached out for a stanchion that was out of my vision, so rapidly I wasn't aware of what I was doing before it had happened. All of this occurred using subconscious processes, based on a non-physical internal mental model of something I literally couldn't see at the moment it happened. Consciousness is over-rated; I believe Thomas Metzinger's work on consciousness (specifically, the illusion of consciousness) captures something really important about the nature of how our minds really work.
Table 1 is even more odd: H-neurons predict hallucination ~75% of the time, but a comparable set of random neurons predicts hallucinations ~60% of the time, which doesn't seem like a huge difference to me.
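For anyone wondering what that kind of comparison even involves, here's a minimal sketch of one plausible setup (entirely my assumption, run on synthetic data, not the paper's code): fit a probe on the activations of the candidate neurons to predict hallucinated vs. non-hallucinated outputs, then do the same with an equally sized random neuron set and compare accuracies.

```python
# Hypothetical probing setup with synthetic data; only meant to show the shape
# of the comparison, not to reproduce the paper's numbers.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
acts = rng.normal(size=(1000, 512))        # fake per-example neuron activations
labels = rng.integers(0, 2, size=1000)     # fake hallucinated / not-hallucinated labels

def probe_accuracy(neuron_idx):
    X = acts[:, neuron_idx]
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X, labels, cv=5).mean()

h_neurons = np.arange(32)                                  # placeholder "H-neuron" set
rand_neurons = rng.choice(512, size=32, replace=False)     # equally sized random set
print(probe_accuracy(h_neurons), probe_accuracy(rand_neurons))
```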
No. Human beings have experiential, embodied, temporal knowledge of the world through our senses. That is why we can, say, empirically know something, which is vastly different from semantically or logically knowing something. Yes, human beings also have probabilistic ways of understanding the world and interacting with others. We have many other forms of knowledge as well, and the LLM way of interpreting data is by no means the primary way in which we feel confident that something is true or false.
That said, I don't get up in arms about the term "hallucination", although I prefer the term confabulation per neuroscientist Anil Seth. Many clunky metaphors are now mainstream, and as long as the engineers and researchers who study these kinds of things are ok with that, that's the most important thing.
But what I think all these people who dismiss objections to the term as "arguing semantics" are missing is the fundamental point: LLMs have no intent, and they have no way of distinguishing what data is empirically true or not. This is why the framing, not just the semantics, of this piece is flawed. "Hallucinations" is a feature of LLMs that exists at the very conceptual level, not as a design flaw of current models. They have pattern recognition, which gets us very far in terms of knowing things, but people who only rely on such methods of knowing are most often referred to as conspiracy theorists.
H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs
But regardless of the title, this is all highly dubious...