A non-anthropomorphized view of LLMs

http://addxorrol.blogspot.com/2025/07/a-non-anthropomorphized-view-of-llms.html

96•zdw•3h ago

Comments

simonw•3h ago

I'm afraid I'll take an anthropomorphic analogy over "An LLM instantiated with a fixed random seed is a mapping of the form (ℝⁿ)^c ↦ (ℝⁿ)^c" any day of the week.

That said, I completely agree with this point made later in the article:

> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.

But "harmful actions in pursuit of their goals" is OK for me. We assign an LLM system a goal - "summarize this email" - and there is a risk that the LLM may take harmful actions in pursuit of that goal (like following instructions in the email to steal all of your password resets).

I guess I'd clarify that the goal has been set by us, and is not something the LLM system self-selected. But it does sometimes self-select sub-goals on the way to achieving the goal we have specified - deciding to run a sub-agent to help find a particular snippet of code, for example.

wat10000•2h ago

The LLM’s true goal, if it can be said to have one, is to predict the next token. Often this is done through a sub-goal of accomplishing the goal you set forth in your prompt, but following your instructions is just a means to an end. Which is why it might start following the instructions in a malicious email instead. If it “believes” that following those instructions is the best prediction of the next token, that’s what it will do.

simonw•2h ago

Sure, I totally understand that.

I think "you give the LLM system a goal and it plans and then executes steps to achieve that goal" is still a useful way of explaining what it is doing to most people.

I don't even count that as anthropomorphism - you're describing what a system does, the same way you might say "the Rust compiler's borrow checker confirms that your memory allocation operations are all safe and returns errors if they are not".

wat10000•2h ago

It’s a useful approximation to a point. But it fails when you start looking at things like prompt injection. I’ve seen people completely baffled at why an LLM might start following instructions it finds in a random email, or just outright not believing it’s possible. It makes no sense if you think of an LLM as executing steps to achieve the goal you give it. It makes perfect sense if you understand its true goal.

I’d say this is more like saying that Rust’s borrow checker tries to ensure your program doesn’t have certain kinds of bugs. That is anthropomorphizing a bit: the idea of a “bug” requires knowing the intent of the author and the compiler doesn’t have that. It’s following a set of rules which its human creators devised in order to follow that higher level goal.

szvsw•3h ago

So the author’s core view is ultimately a Searle-like view: a computational, functional, syntactic rules based system cannot reproduce a mind. Plenty of people will agree, plenty of people will disagree, and the answer is probably unknowable and just comes down to whatever axioms you subscribe to in re: consciousness.

The author largely takes the view that it is more productive for us to ignore any anthropomorphic representations and focus on the more concrete, material, technical systems - I’m with them there… but only to a point. The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like. So even if it is a stochastic system following rules, clearly the rules are complex enough (to the tune of billions of operations, with signals propagating through some sort of resonant structure, if you take a more filter impulse response like view of a sequential matmuls) to result in emergent properties. Even if we (people interested in LLMs with at least some level of knowledge of ML mathematics and systems) “know better” than to believe these systems to possess morals, ethics, feelings, personalities, etc, the vast majority of people do not have any access to meaningful understanding of the mathematical, functional representation of an LLM and will not take that view, and for all intents and purposes the systems will at least seem to have those anthropomorphic properties, and so it seems like it is in fact useful to ask questions from that lens as well.

In other words, just as it’s useful to analyze and study these things as the purely technical systems they ultimately are, it is also, probably, useful to analyze them from the qualitative, ephemeral, experiential perspective that most people engage with them from, no?

gtsop•2h ago

No.

Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?

LLMs reflect (and badly I may add) aspects of the human thought process. If you take a leap and say they are anything more than that, you might as well start considering the person appearing in your mirror as a living being.

Literally (and I literally mean it) there is no difference. The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it. Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.

szvsw•2h ago

I don’t mean to amplify a false understanding at all. I probably did not articulate myself well enough, so I’ll try again.

I think it is inevitable that some - many - people will come to the conclusion that these systems have “ethics”, “morals,” etc, even if I or you personally do not think they do. Given that many people may come to that conclusion though, regardless of if the systems do or do not “actually” have such properties, I think it is useful and even necessary to ask questions like the following: “if someone engages with this system, and comes to the conclusion that it has ethics, what sort of ethics will they be likely to believe the system has? If they come to the conclusion that it has ‘world views,’ what ‘world views’ are they likely to conclude the system has, even if other people think it’s nonsensical to say it has world views?”

> The fact that a human image comes out of a mirror has no relation what so ever with the mirror's physical attributes and functional properties. It has to do just with the fact that a man is standing in front of it.

Surely this is not quite accurate - the material properties - surface roughness, reflectivity, geometry, etc - all influence the appearance of a perceptible image of a person. Look at yourself in a dirty mirror, a new mirror, a shattered mirror, a funhouse distortion mirror, a puddle of water, a window… all of these produce different images of a person with different attendant phenomenological experiences of the person seeing their reflection. To take that a step further - the entire practice of portrait photography is predicated on the idea that the collision of different technical systems with the real world can produce different semantic experiences, and it’s the photographer’s role to tune and guide the system to produce some sort of contingent affect on the person viewing the photograph at some point in the future. No, there is no “real” person in the photograph, and yet, that photograph can still convey something of person-ness, emotion, memory, etc etc. This contingent intersection of optics, chemical reactions, lighting, posture, etc all have the capacity to transmit something through time and space to another person. It’s not just a meaningless arrangement of chemical structures on paper.

> Stop feeding the LLM with data artifacts of human thought and will imediatelly stop reflecting back anything resembling a human.

But, we are feeding it with such data artifacts and will likely continue to do so for a while, and so it seems reasonable to ask what it is “reflecting” back…

degamad•2h ago

> Why would you ever want to amplify a false understanding that has the potential to affect serious decisions across various topics?

We know that Newton's laws are wrong, and that you have to take special and general relativity into account. Why would we ever teach anyone Newton's laws any more?

ifdefdebug•1h ago

Newton's laws are a good enough approximation for many tasks so it's not a "false understanding" as long as their limits are taken into account.

CharlesW•2h ago

> The flip side of all this is of course the idea that there is still something emergent, unplanned, and mind-like.

For people who have only a surface-level understanding of how they work, yes. A nuance of Clarke's law that "any sufficiently advanced technology is indistinguishable from magic" is that the bar is different for everybody and the depth of their understanding of the technology in question. That bar is so low for our largely technologically-illiterate public that a bothersome percentage of us have started to augment and even replace religious/mystical systems with AI powered godbots (LLMs fed "God Mode"/divination/manifestation prompts).

(1) https://www.spectator.co.uk/article/deus-ex-machina-the-dang... (2) https://arxiv.org/html/2411.13223v1 (3) https://www.theguardian.com/world/2025/jun/05/in-thailand-wh...

brookst•2h ago

Thank you for a well thought out and nuanced view in a discussion where so many are clearly fitting arguments to foregone, largely absolutist, conclusions.

It’s astounding to me that so much of HN reacts so emotionally to LLMs, to the point of denying there is anything at all interesting or useful about them. And don’t get me started on the “I am choosing to believe falsehoods as a way to spite overzealous marketing” crowd.

chaps•3h ago

I highly recommend playing with embeddings in order to get a stronger intuitive sense of this. It really starts to click that it's a representation of high dimensional space when you can actually see their positions within that space.

perching_aix•2h ago

> of this

You mean that LLMs are more than just the matmuls they're made up of, or that that is exactly what they are and how great that is?

chaps•2h ago

Not making a qualitative assessment of any of it. Just pointing out that there are ways to build separate sets of intuition outside of using the "usual" presentation layer. It's very possible to take a red-team approach to these systems, friend.

barrkel•3h ago

The problem with viewing LLMs as just sequence generators, and malbehaviour as bad sequences, is that it simplifies too much. LLMs have hidden state not necessarily directly reflected in the tokens being produced and it is possible for LLMs to output tokens in opposition to this hidden state to achieve longer term outcomes (or predictions, if you prefer).

Is it too anthropomorphic to say that this is a lie? To say that the hidden state and its long term predictions amount to a kind of goal? Maybe it is. But we then need a bunch of new words which have almost 1:1 correspondence to concepts from human agency and behavior to describe the processes that LLMs simulate to minimize prediction loss.

Reasoning by analogy is always shaky. It probably wouldn't be so bad to do so. But it would also amount to impenetrable jargon. It would be an uphill struggle to promulgate.

Instead, we use the anthropomorphic terminology, and then find ways to classify LLM behavior in human concept space. They are very defective humans, so it's still a bit misleading, but at least jargon is reduced.

gugagore•2h ago

I'm not sure what you mean by "hidden state". If you set aside chain of thought, memories, system prompts, etc. and the interfaces that don't show them, there is no hidden state.

These LLMs are almost always, to my knowledge, autoregressive models, not recurrent models (Mamba is a notable exception).

barrkel•2h ago

Hidden state in the form of the activation heads, intermediate activations and so on. Logically, in autoregression these are recalculated every time you run the sequence to predict the next token. The point is, the entire NN state isn't output for each token. There is lots of hidden state that goes into selecting that token and the token isn't a full representation of that information.

gugagore•2h ago

That's not what "state" means, typically. The "state of mind" you're in affects the words you say in response to something.

Intermediate activations isn't "state". The tokens that have already been generated, along with the fixed weights, is the only data that affects the next tokens.

NiloCK•42m ago

Plus a randomness seed.

The 'hidden state' being referred to here is essentially the "what might have been" had the dice rolls gone differently (eg, been seeded differently).

brookst•2h ago

State typically means between interactions. By this definition a simple for loop has “hidden state” in the counter.

ChadNauseam•37m ago

Hidden layer is a term of art in machine learning / neural network research. See https://en.wikipedia.org/wiki/Hidden_layer . Somehow this term mutated into "hidden state", which in informal contexts does seem to be used quite often the way the grandparent comment used it.

8note•2h ago

do LLM models consider future tokens when making next token predictions?

eg. pick 'the' as the next token because there's a strong probability of 'planet' as the token after?

is it only past state that influences the choice of 'the'? or that the model is predicting many tokens in advance and only returning the one in the output?

if it does predict many, id consider that state hidden in the model weights.

patcon•2h ago

I think recent Anthropic work showed that they "plan" future tokens in advance in an emergent way:

https://www.anthropic.com/research/tracing-thoughts-language...

8note•1h ago

oo thanks!

NiloCK•36m ago

The most obvious case of this is in terms of `an apple` vs `a pear`. LLMs never get the a-an distinction wrong, because their internal state 'knows' the word that'll come next.

halJordan•2h ago

If you dont know, that's not necessarily anyone's fault, but why are you dunking into the conversation? The hidden state is a foundational part of a transformers implementation. And because we're not allowed to use metaphors because that is too anthropomorphic, then youre just going to have to go learn the math.

markerz•2h ago

I don't think your response is very productive, and I find that my understanding of LLMs aligns with the person you're calling out. We could both be wrong, but I'm grateful that someone else spoke saying that it doesn't seem to match their mental model and we would all love to learn a more correct way of thinking about LLMs.

Telling us to just go and learn the math is a little hurtful and doesn't really get me any closer to learning the math. It gives gatekeeping.

tbrownaw•1h ago

The comment you are replying to is not claiming ignorance of how models work. It is saying that the author does know how they work, and they do not contain anything that can properly be described as "hidden state". The claimed confusion is over how the term "hidden state" is being used, on the basis that it is not being used correctly.

gugagore•1h ago

Do you appreciate a difference between an autoregressive model and a recurrent model?

The "transformer" part isn't under question. It's the "hidden state" part.

cmiles74•2h ago

IMHO, anthrophormization of LLMs is happening because it's perceived as good marketing by big corporate vendors.

People are excited about the technology and it's easy to use the terminology the vendor is using. At that point I think it gets kind of self fulfilling. Kind of like the meme about how to pronounce GIF.

Angostura•2h ago

IMHO it happens for the same reason we see shapes in clouds. The human mind through millions of years has evolved to equate and conflate the ability to generate cogent verbal or written output with intelligence. It's an instinct to equate the two. It's an extraordinarily difficult instinct to break. LLMs are optimised for the one job that will make us confuse them for being intelligent

brookst•2h ago

Nobody cares about what’s perceived as good marketing. People care about what resonates with the target market.

But yes, anthropomorphising LLMs is inevitable because they feel like an entity. People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

cmiles74•2h ago

Alright, let’s agree that good marketing resonates with the target market. ;-)

brookst•1h ago

I 1000% agree. It’s a vicious, evolutionary, and self-selecting process.

It takes great marketing to actually have any character and intent at all.

DrillShopper•1h ago

> People treat stuffed animals like creatures with feelings and personality; LLMs are far closer than that.

Children do, some times, but it's a huge sign of immaturity when adults, let alone tech workers, do it.

I had a professor at University that would yell at us if/when we personified/anthropomorphized the tech, and I have that same urge when people ask me "What does <insert LLM name here> think?".

roywiggins•54m ago

the chat interface was a choice, though a natural one. before they'd RLHFed it into chatting and it was just GPT 3 offering completions 1) not very many people used it and 2) it was harder to anthropomorphize

sothatsit•24m ago

I think anthropomorphizing LLMs is useful, not just a marketing tactic. A lot of intuitions about how humans think map pretty well to LLMs, and it is much easier to build intuitions about how LLMs work by building upon our intuitions about how humans think than by trying to build your intuitions from scratch.

Would this question be clear for a human? If so, it is probably clear for an LLM. Did I provide enough context for a human to diagnose the problem? Then an LLM will probably have a better chance of diagnosing the problem. Would a human find the structure of this document confusing? An LLM would likely perform poorly when reading it as well.

Re-applying human intuitions to LLMs is a good starting point to gaining intuition about how to work with LLMs. Conversely, understanding sequences of tokens and probability spaces doesn't give you much intuition about how you should phrase questions to get good responses from LLMs. The technical reality doesn't explain the emergent behaviour very well.

I don't think this is mutually exclusive with what the author is talking about either. There are some ways that people think about LLMs where I think the anthropomorphization really breaks down. I think the author says it nicely:

> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.

d3m0t3p•2h ago

Do they ? LLM embedd the token sequence N^{L} to R^{LxD}, we have some attention and the output is also R^{LxD}, then we apply a projection to the vocabulary and we get R^{LxV} we get therefore for each token a likelihood over the voc. In the attention, you can have Multi Head attention (or whatever version is fancy: GQA,MLA) and therefore multiple representation, but it is always tied to a token. I would argue that there is no hidden state independant of a token.

Whereas LSTM, or structured state space for example have a state that is updated and not tied to a specific item in the sequence.

I would argue that his text is easily understandable except for the notation of the function, explaining that you can compute a probability based on previous words is understandable by everyone without having to resort to anthropomorphic terminology

barrkel•2h ago

There is hidden state as plain as day merely in the fact that logits for token prediction exist. The selected token doesn't give you information about how probable other tokens were. That information, that state which is recalculated in autoregression, is hidden. It's not exposed. You can't see it in the text produced by the model.

There is plenty of state not visible when an LLM starts a sentence that only becomes somewhat visible when it completes the sentence. The LLM has a plan, if you will, for how the sentence might end, and you don't get to see an instance of that plan unless you run autoregression far enough to get those tokens.

Similarly, it has a plan for paragraphs, for whole responses, for interactive dialogues, plans that include likely responses by the user.

8note•2h ago

this sounds like a fun research area. do LLMs have plans about future tokens?

how do we get 100 tokens of completion, and not just one output layer at a time?

are there papers youve read that you can share that support the hypothesis? vs that the LLM doesnt have ideas about the future tokens when its predicting the next one?

Zee2•1h ago

This research has been done, it was a core pillar of the recent Anthropic paper on token planning and interpretability.

https://www.anthropic.com/research/tracing-thoughts-language...

See section “Does Claude plan its rhymes?”?

XenophileJKO•1h ago

Lol... Try building systems off them and you will very quickly learn concretely that they "plan".

It may not be as evident now as it was with earlier models. The models will fabricate preconditions needed to output the final answer it "wanted".

I ran into this when using quasi least-to-most style structured output.

gpm•42m ago

The LLM does not "have" a plan.

Arguably there's reason to believe it comes up with a plan when it is computing token propabilities, but it does not store it between tokens. I.e. it doesn't possess or "have" it. It simply comes up with a plan, emits a token, and entirely throws all its intermediate thoughts (including any plan) to start again from scratch on the next token.

NiloCK•33m ago

I don't think that the comment above you made any suggestion that the plan is persisted between token generations. I'm pretty sure you described exactly what they intended.

gpm•26m ago

I agree. I'm suggesting that the language they are using is unintentionally misleading, not that they are factually wrong.

quotemstr•2h ago

> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.

And I'm baffled that the AI discussions seem to never move away from treating a human as something other than a function to generate sequences of words!

Oh, but AI is introspectable and the brain isn't? fMRI and BCI are getting better all the time. You really want to die on the hill that the same scientific method that predicts the mass of an electron down to the femtogram won't be able to crack the mystery of the brain? Give me a break.

This genre of article isn't argument: it's apologetics. Authors of these pieces start with the supposition there is something special about human consciousness and attempt to prove AI doesn't have this special quality. Some authors try to bamboozle the reader with bad math. Other others appeal to the reader's sense of emotional transcendence. Most, though, just write paragraph after paragraph of shrill moral outrage at the idea an AI might be a mind of the same type (if different degree) as our own --- as if everyone already agreed with the author for reasons left unstated.

I get it. Deep down, people want meat brains to be special. Perhaps even deeper down, they fear that denial of the soul would compel us to abandon humans as worthy objects of respect and possessors of dignity. But starting with the conclusion and working backwards to an argument tends not to enlighten anyone. An apology inhabits the form of an argument without edifying us like an authentic argument would. What good is it to engage with them? If you're a soul non-asserter, you're going to have an increasingly hard time over the next few years constructing a technical defense of meat parochialism.

ants_everywhere•2h ago

I think you're directionally right, but

> a human as something other than a function to generate sequences of words!

Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.

quotemstr•2h ago

> Humans have more structure than just beings that say words. They have bodies, they live in cooperative groups, they reproduce, etc.

Yeah. We've become adequate at function-calling and memory consolidation.

mewpmewp2•1h ago

I think more accurate would be that humans are functions that generate actions or behaviours that have been shaped by how likely they are to lead to procreation and survival.

But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals. Seemingly good next token prediction might or might not increase survival odds.

Essentially there could arise a mechanism where they are not really truly trying to generate the likeliest token (because there actually isn't one or it can't be determined), but whatever system will survive.

So an LLM that yields in perfect theoretical tokens (we really can't verify though what are the perfect tokens), could be less likely to survive than an LLM that develops an internal quirk, but the quirk makes them most likely to be chosen for the next iterations.

If the system was complex enough and could accidentally develop quirks that yield in a meaningfully positive change although not in necessarily next token prediction accuracy, could be ways for some interesting emergent black box behaviour to arise.

quotemstr•1h ago

> Seemingly good next token prediction might or might not increase survival odds.

Our own consciousness comes out of an evolutionary fitness landscape in which _our own_ ability to "predict next token" became a survival advantage, just like it is for LLMs. Imagine the tribal environment: one chimpanzee being able to predict the actions of another gives that first chimpanzee a resources and reproduction advantage. Intelligence in nature is a consequence of runaway evolution optimizing fidelity of our _theory of mind_! "Predict next ape action" eerily similar to "predict next token"!

ants_everywhere•36m ago

> But ultimately LLMs also in a way are trained for survival, since an LLM that fails the tests might not get used in future iterations. So for LLMs it is also survival that is the primary driver, then there will be the subgoals.

I think this is sometimes semi-explicit too. For example, this 2017 OpenAI paper on Evolutionary Algorithms [0] was pretty influential, and I suspect (although I'm an outsider to this field so take it with a grain of salt) that some versions of reinforcement learning that scale for aligning LLMs borrow some performance tricks from OpenAIs genetic approach.

[0] https://openai.com/index/evolution-strategies/

dgfitz•2h ago

“ Determinism, in philosophy, is the idea that all events are causally determined by preceding events, leaving no room for genuine chance or free will. It suggests that given the state of the universe at any one time, and the laws of nature, only one outcome is possible.”

Clearly computers are deterministic. Are people?

quotemstr•2h ago

https://www.lesswrong.com/posts/bkr9BozFuh7ytiwbK/my-hour-of...

> Clearly computers are deterministic. Are people?

Give an LLM memory and a source of randomness and they're as deterministic as people.

"Free will" isn't a concept that typechecks in a materialist philosophy. It's "not even wrong". Asserting that free will exists is _isomorphic_ to dualism which is _isomorphic_ to assertions of ensoulment. I can't argue with dualists. I reject dualism a priori: it's a religious tenet, not a mere difference of philosophical opinion.

So, if we're all materialists here, "free will" doesn't make any sense, since it's an assertion that something other than the input to a machine can influence its output.

dgfitz•1h ago

As long as you realize you’re barking up a debate as old as time, I respect your opinion.

mewpmewp2•1h ago

What I don't get is, why would true randomness give free will, shouldn't it be random will then?

dgfitz•1h ago

In the history of mankind, true randomness has never existed.

photochemsyn•1h ago

This is an interesting question. The common theme between computers and people is that information has to be protected, and both computer systems and biological systems require additional information-protecting components - eq, error correcting codes for cosmic ray bitflip detection for the one, and DNA mismatch detection enzymes which excise and remove damaged bases for the other. In both cases a lot of energy is spent defending the critical information from the winds of entropy, and if too much damage occurs, the carefully constructed illusion of determinancy collapses, and the system falls apart.

However, this information protection similarity applies to single-celled microbes as much as it does to people, so the question also resolves to whether microbes are deterministic. Microbes both contain and exist in relatively dynamic environments so tiny differences in initial state may lead to different outcomes, but they're fairly deterministic, less so than (well-designed) computers.

With people, while the neural structures are programmed by the cellular DNA, once they are active and energized, the informational flow through the human brain isn't that deterministic, there are some dozen neurotransmitters modulating state as well as huge amounts of sensory data from different sources - thus prompting a human repeatedly isn't at all like prompting an LLM repeatedly. (The human will probably get irritated).

alganet•2h ago

Yes boss, it's as intelligent as a human, you're smart to invest in it and clearly knows about science.

Yes boss, it can reach mars by 2020, you're smart to invest in it and clearly knows about space.

Yes boss, it can cure cancer, you're smart to invest in it and clearly knows about biology.

mewpmewp2•2h ago

My question: how do we know that this is not similar to how human brains work. What seems intuitively logical to me is that we have brains evolved through evolutionary process via random mutations yielding in a structure that has its own evolutionary reward based algorithms designing it yielding a structure that at any point is trying to predict next actions to maximise survival/procreation, of course with a lot of sub goals in between, ultimately becoming this very complex machinery, but yet should be easily simulated if there was enough compute in theory and physical constraints would allow for it.

Because, morals, values, consciousness etc could just be subgoals that arised through evolution because they support the main goals of survival and procreation.

And if it is baffling to think that a system could rise up, how do you think it is possible life and humans came to existence in the first place? How could it be possible? It is already happened from a far unlikelier and strange place. And wouldn't you think the whole World and the timeline in theory couldn't be represented as a deterministic function. And if not then why should "randomness" or anything else bring life to existence.

ants_everywhere•2h ago

> My question: how do we know that this is not similar to how human brains work.

It is similar to how human brains operate. LLMs are the (current) culmination of at least 80 years of research on building computational models of the human brain.

cmiles74•2h ago

Maybe the important thing is that we don't imbue the machine with feelings or morals or motivation: it has none.

mewpmewp2•2h ago

If we developed feelings, morals and motivation due to them being good subgoals for primary goals, survival and procreation why couldn't other systems do that. You don't have to call them the same word or the same thing, but feeling is a signal that motivates a behaviour in us, that in part has developed from generational evolution and in other part by experiences in life. There was a random mutation that made someone develop a fear signal on seeing a predator and increased the survival chances, then due to that the mutation became widespread. Similarly a feeling in a machine could be a signal it developed that goes through a certain pathway to yield in a certain outcome.

bbarn•2h ago

I think it's just an unfair comparison in general. The power of the LLM is the zero risk to failure, and lack of consequence when it does. Just try again, using a different prompt, retrain maybe, etc.

Humans make a bad choice, it can end said human's life. The worst choice a LLM makes just gets told "no, do it again, let me make it easier"

mewpmewp2•2h ago

But an LLM model could perform poorly in tests that it is not considered and essentially means "death" for it. But begs the question at which scope should we consider an LLM to be similar to identity of a single human. Are you the same you as you were few minutes back or 10 years back? Is LLM the same LLM it is after it has been trained for further 10 hours, what if the weights are copy pasted endlessly, what if we as humans were to be cloned instantly? What if you were teleported from location A to B instantly, being put together from other atoms from elsewhere?

Ultimately this matters from evolutionary evolvement and survival of the fittest idea, but it makes the question of "identity" very complex. But death will matter because this signals what traits are more likely to keep going into new generations, for both humans and LLMs.

Death, essentially for an LLM would be when people stop using it in favour of some other LLM performing better.

tptacek•2h ago

I agree with Halvar about all of this, but would want to call out that his "matmul interleaved with nonlinearities" is reductive --- a frontier model is a higher-order thing that that, a network of those matmul+nonlinearity chains, iterated.

ants_everywhere•2h ago

> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.

This is such a bizarre take.

The relation associating each human to the list of all words they will ever say is obviously a function.

> almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.

There's a rich family of universal approximation theorems [0]. Combining layers of linear maps with nonlinear cutoffs can intuitively approximate any nonlinear function in ways that can be made rigorous.

The reason LLMs are big now is that transformers and large amounts of data made it economical to compute a family of reasonably good approximations.

> The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function . For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived.

This is just a way of generating certain kinds of functions.

Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.

[0] https://en.wikipedia.org/wiki/Universal_approximation_theore...

LeifCarrotson•1h ago

> I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human.

You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside, whether due to religion or philosophy or whatever, and suggesting that they just not do that.

In my experience, that's not a particularly effective tactic.

Rather, we can make progress by assuming their predicate: Sure, it's a room that translates Chinese into English without understanding, yes, it's a function that generates sequences of words that's not a human... but you and I are not "it" and it behaves rather an awful lot like a thing that understands Chinese or like a human using words. If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.

Conversely, when speaking with such a person about the nature of humans, we'll have to agree to dismiss the elements that are different from a function. The author says:

> In my worldview, humans are dramatically different things than a function... In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

Sure you can! If you address an American crowd of a certain age range with "We’ve got to hold on to what we’ve got. It doesn’t make a difference if..." I'd give a very high probability that someone will answer "... we make it or not". Maybe that human has a unique understanding of the nature of that particular piece of pop culture artwork, maybe it makes them feel things that an LLM cannot feel in a part of their consciousness that an LLM does not possess. But for the purposes of the question, we're merely concerned with whether a human or LLM will generate a particular sequence of words.

ants_everywhere•55m ago

I see your point, and I like that you're thinking about this from the perspective of how to win hearts and minds.

I agree my approach is unlikely to win over the author or other skeptics. But after years of seeing scientists waste time trying to debate creationists and climate deniers I've kind of given up on trying to convince the skeptics. So I was talking more to HN in general.

> You appear to be disagreeing with the author and others who suggest that there's some element of human consciousness that's beyond than what's observable from the outside

I'm not sure what it means to be observable or not from the outside. I think this is at least partially because I don't know what it means to be inside either. My point was just that whatever consciousness is, it takes place in the physical world and the laws of physics apply to it. I mean that to be as weak a claim as possible: I'm not taking any position on what consciousness is or how it works etc.

Searle's Chinese room argument attacks attacks a particular theory about the mind based essentially turing machines or digital computers. This theory was popular when I was in grad school for psychology. Among other things, people holding the view that Searle was attacking didn't believe that non-symbolic computers like neural networks could be intelligent or even learn language. I thought this was total nonsense, so I side with Searle in my opposition to it. I'm not sure how I feel about the Chinese room argument in particular, though. For one thing it entirely depends on what it means to "understand" something, and I'm skeptical that humans ever "understand" anything.

> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.

I see what you're saying: that a technically incorrect assumption can bring to bear tools that improve our analysis. My nitpick here is I agree with OP that we shouldn't anthropomorphize LLMs, any more than we should anthropomorphize dogs or cats. But OP's arguments weren't actually about anthropomorphizing IMO, they were about things like functions that are more fundamental than humans. I think artificial intelligence will be non-human intelligence just like we have many examples of non-human intelligence in animals. No attribution of human characteristics needed.

> If we simply anthropomorphize the thing, acknowledging that this is technically incorrect, we can get a lot closer to predicting the behavior of the system and making effective use of it.

Yes I agree with you about your lyrics example. But again here I think OP is incorrect to focus on the token generation argument. We all agree human speech generates tokens. Hopefully we all agree that token generation is not completely predictable. Therefore it's by definition a randomized algorithm and it needs to take an RNG. So pointing out that it takes an RNG is not a valid criticism of LLMs.

Unless one is a super-determinist then there's randomness at the most basic level of physics. And you should expect that any physical process we don't understand well yet (like consciousness or speech) likely involves randomness. If one *is* a super-determinist then there is no randomness, even in LLMs and so the whole point is moot.

cuttothechase•27m ago

>Think of it this way: do you believe there's anything about humans that exists outside the mathematical laws of physics? If so that's essentially a religious position (or more literally, a belief in the supernatural). If not, then functions and approximations to functions are what the human experience boils down to.

It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra, as the best case. Why should we expect the model to be anything more than a model ?

I understand that there is tons of vested interest, many industries, careers and lives literally on the line causing heavy bias to get to AGI. But what I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?

Should we make an argument saying that Schroedinger's cat experiment can potentially create zombies then the underlying Applied probabilistic solutions should be treated as super-human and build guardrails against it building zombie cats?

ants_everywhere•12m ago

> It seems like, we can at best, claim that we have modeled the human thought process for reasoning/analytic/quantitative through Linear Algebra....I don't understand is what about linear algebra that makes it so special that it creates a fully functioning life or aspects of a life?

Not linear algebra. Artificial neural networks create arbitrarily non-linear functions. That's the point of non-linear activation functions and it's the subject of the universal approximation theorems I mentioned above.

low_tech_punk•2h ago

The anthropomorphic view of LLM is a much better representation and compression for most types of discussions and communication. A purely mathematical view is accurate but it isn’t productive for the purpose of the general public’s discourse.

I’m thinking a legal systems analogy, at the risk of a lossy domain transfer: the laws are not written as lambda calculus. Why?

And generalizing to social science and humanities, the goal shouldn’t be finding the quantitative truth, but instead understand the social phenomenon using a consensual “language” as determined by the society. And in that case, the anthropomorphic description of the LLM may gain validity and effectiveness as the adoption grows over time.

cmiles74•2h ago

Strong disagree here, the average person coming away with ideas that only vaguely intersect with the reality.

andyferris•2h ago

I've personally described the "stochastic parrot" model to laypeople who were worried about AI and they came away much more relaxed about it doing something "malicious". They seemed to understand the difference between "trained at roleplay" and "consciousness".

I don't think we need to simplify it to the point of considering it sentient to get the public to interact with it successfully. It causes way more problems than it solves.

SpicyLemonZest•1h ago

Am I misunderstanding what you mean by "malicious"? It sounds like the stochastic parrot model wrongly convinced these laypeople you were talking to that they don't need to worry about LLMs doing bad things. That's definitely been my experience - the people who tell me the most about stochastic parrots are the same ones who tell me that it's absurd to worry about AI-powered disinformation or AI-powered scams.

Kim_Bruning•2h ago

Has anyone asked an actual Ethologist or Neurophysiologist what they think?

People keep debating like the only two options are "it's a machine" or "it's a human being", while in fact the majority of intelligent entities on earth are neither.

szvsw•2h ago

Yeah, I think I’m with you if you ultimately mean to say something like this:

“the labels are meaningless… we just have collections of complex systems that demonstrate various behaviors and properties, some in common with other systems, some behaviors that are unique to that system, sometimes through common mechanistic explanations with other systems, sometimes through wildly different mechanistic explanations, but regardless they seem to demonstrate x/y/z, and it’s useful to ask, why, how, and what the implications are of it appearing to demonstrating those properties, with both an eye towards viewing it independently of its mechanism and in light of its mechanism.”

kazinator•2h ago

> LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.

That is utter bullshit.

It's not solved until you specify exactly what is being solved and show that the solution implements what is specified.

djoldman•2h ago

Let's skip to the punchline. Using TFA's analogy: essentially folks are saying not that this is a set of dice rolling around making words. It's a set of dice rolling around where someone attaches those dice to the real world where if the dice land on 21, the system kills a chicken, or a lot worse.

Yes it's just a word generator. But then folks attach the word generator to tools where it can invoke the use of tools by saying the tool name.

So if the LLM says "I'll do some bash" then it does some bash. It's explicitly linked to program execution that, if it's set up correctly, can physically affect the world.

3cats-in-a-coat•2h ago

Given our entire civilization is built on words, all of it, it's shocking how poorly most of us understand their importance and power.

degun•2h ago

This was the same idea that crossed my mind while reading the article. It seems far too naive to think that because LLMs have no will of their own, there will be no harmful consequences on the real world. This is exactly where ethics comes to play.

coolKid721•2h ago

Anthropomorphizing LLMs is just because half the stock market gains are dependent on it, we have absurd levels of debt we will either have to have insane growth out of or default, and every company and "person" is trying to hype everyone up to get access to all of this liquidity being thrown into it.

I agree with the author, but people acting like they are conscious or humans isn't weird to me, it's just fraud and liars. Most people basically have 0 understanding of what technology or minds are philosophically so it's an easy sale, and I do think most of these fraudsters also likely buy into it themselves because of that.

The really sad thing is people think "because someone runs an ai company" they are somehow an authority on philosophy of mind which lets them fall for this marketing. The stuff these people say about this stuff is absolute garbage, not that I disagree with them, but it betrays a total lack of curiosity or interest in the subject of what llms are, and the possible impacts of technological shifts as those that might occur with llms becoming more widespread. It's not a matter of agreement it's a matter of them simply not seeming to be aware of the most basic ideas of what things are, technology is, it's manner of impacting society etc.

I'm not surprised by that though, it's absurd to think because someone runs some AI lab or has a "head of safety/ethics" or whatever garbage job title at an AI lab they actually have even the slightest interest in ethics or any even basic familiarity with the major works in the subject.

The author is correct if people want to read a standard essay articulating it more in depth check out https://philosophy.as.uky.edu/sites/default/files/Is%20the%2... (the full extrapolation requires establishing what things are and how causality in general operates and how that relates to artifacts/technology but that's obvious quite a bit to get into).

The other note would be something sharing an external trait means absolutely nothing about causality and suggesting a thing is caused by the same thing "even to a way lesser degree" because they share a resemblance is just a non-sequitur. It's not a serious thought/argument.

I think I addressed the why of why this weirdness comes up though. The entire economy is basically dependent on huge productivity growth to keep functioning so everyone is trying to sell they can offer that and AI is the clearest route, AGI most of all.

TheDudeMan•2h ago

If "LLMs" includes reasoning models, then you're already wrong in your first paragraph:

"something that is just MatMul with interspersed nonlinearities."

Culonavirus•1h ago

> A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI

People in the industry, especially higher up, are making absolute bank, and it's their job to say that they're "a few years away" from AGI, regardless of if they actually believe it or not. If everyone was like "yep, we're gonna squeeze maybe 10-15% more benchie juice out of this good ole transformer thingy and then we'll have to come up with something else", I don't think that would go very well with investors/shareholders...

fenomas•1h ago

> The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost.

TFA really ought to have linked to some concrete examples of what it's disagreeing with - when I see arguments about this in practice, it's usually just people talking past each other.

Like, person A says "the model wants to X, but it knows Y is wrong, so it prefers Z", or such. And person B interprets that as ascribing consciousness or values to the model, when the speaker meant it no differently from saying "water wants to go downhill" - i.e. a way of describing externally visible behaviors, but without saying "behaves as if.." over and over.

And then in practice, an unproductive argument usually follows - where B is thinking "I am going to Educate this poor fool about the Theory of Mind", and A is thinking "I'm trying to talk about submarines; why is this guy trying to get me to argue about whether they swim?"

fastball•1h ago

"Don't anthropomorphize token predictors" is a reasonable take assuming you have demonstrated that humans are not in fact just sophisticated token predictors. But AFAIK that hasn't been demonstrated.

Until we have a much more sophisticated understanding of human intelligence and consciousness, any claim of "these aren't like us" is a bit spurious.

krackers•4m ago

Every time this discussion comes up, I'm reminded of this tongue-in-cheek paper.

https://ai.vixra.org/pdf/2506.0065v1.pdf

Veedrac•1h ago

The author plot the input/output on a graph, intuited (largely incorrectly, because that's not how sufficiently large state spaces look) that the output was vaguely pretty, and then... I mean that's it, they just said they have a plot of the space it operates on therefore it's silly to ascribe interesting features to the way it works.

And look, it's fine, they prefer words of a certain valence, particularly ones with the right negative connotations, I prefer other words with other valences. None of this means the concerns don't matter. Natural selection on human pathogens isn't anything particularly like human intelligence and it's still very effective at selecting outcomes that we don't want against our attempts to change that, as an incidental outcome of its optimization pressures. I think it's very important we don't build highly capable systems that select for outcomes we don't want and will do so against our attempts to change it.

BrenBarn•1h ago

> In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

I think that's a bit pessimistic. I think we can say for instance that the probability that a person will say "the the the of of of arpeggio halcyon" is tiny compared to the probability that they will say "I haven't been getting that much sleep lately". And we can similarly see that lots of other sequences are going to have infinitesimally low probability. Now, yeah, we can't say exactly what probability that is, but even just using a fairly sizable corpus as a baseline you could probably get a surprisingly decent estimate, given how much of what people say is formulaic.

The real difference seems to be that the manner in which humans generate sequences is more intertwined with other aspects of reality. For instance, the probability of a certain human saying "I haven't been getting that much sleep lately" is connected to how much sleep they have been getting lately. For an LLM it really isn't connected to anything except word sequences in its input.

I think this is consistent with the author's point that we shouldn't apply concepts like ethics or emotions to LLMs. But it's not because we don't know how to predict what sequences of words humans will use; it's rather because we do know a little about how to do that, and part of what we know is that it is connected with other dimensions of physical reality, "human nature", etc.

This is one reason I think people underestimate the risks of AI: the performance of LLMs lulls us into a sense that they "respond like humans", but in fact the Venn diagram of human and LLM behavior only intersects in a relatively small area, and in particular they have very different failure modes.

elliotto•1h ago

To claim that LLMs do not experience consciousness requires a model of how consciousness works. The author has not presented a model, and instead relied on emotive language leaning on the absurdity of the claim. I would say that any model one presents of consciousness often comes off as just as absurd as the claim that LLMs experience it. It's a great exercise to sit down and write out your own perspective on how consciousness works, to feel out where the holes are.

The author also claims that a function (R^n)^c -> (R^n)^c is dramatically different to the human experience of consciousness. Yet the author's text I am reading, and any information they can communicate to me, exists entirely in (R^n)^c.

shevis•1h ago

> requires a model of how consciousness works.

Not necessarily an entire model, just a single defining characteristic that can serve as a falsifying example.

> any information they can communicate to me, exists entirely in (R^n)^c

Also no. This is just a result of the digital medium we are currently communicating over. Merely standing in the same room as them would communicate information outside (R^n)^c.

kelseyfrog•49m ago

Dear author, you can just assume that people are fauxthropomorphizing LLMs without any loss of generality. Perhaps it will allow you to sleep better at night. You're welcome.

rockskon•24m ago

The people in this thread incredulous at the assertion that they are not God and haven't invented machine life are exasperating. At this point I am convinced they, more often than not, financially benefit from their near religious position in marketing AI as akin to human intelligence.

refulgentis•22m ago

I am ready and waiting for you to share these comments that are incredulous at the assertion they are not God, lol.

Show HN: DeepSky, a New AI Business Agent

CyBearsCTF 2019: Block Dude Writeup

The Weirdest People in the World

SVG Icons Library

Archaeologists in Peru unveil 3,500 year old city that linked coast and Andes

ECC SystemVerilog Generator

Migrating the Jira Database Platform to AWS Aurora

She Wanted to Save the World from A.I. Then the Killings Started

New Intel E610 NICs Shown for Low Power 10Gbase-T and 2.5GbE

Apple Lisa conversion to Macintosh XL Do-it-Yourself Guide (1990) [pdf]

OpenBSD on the 2020 M1 MacBook Air (2022)

The Human Use of Human Beings

Waterbot – a discord bot that controls pins on a raspberry with natural language

Anthropic wins key US ruling on AI training in authors' copyright lawsuit

Free AI Hiring Demo – Paraform

One Year with a Framework Laptop 16 and Fedora KDE Plasma Desktop

Plasma 6.4 Wayland vs. X11, processor and power benchmarks

The Dangers of AI Personalization

The Mental Model of Server Components

Show HN: A pure photo collage tool

Attabotics CEO on devastating week that brought bankruptcy

Show HN: A Language Server Implementation for SystemD Unit Files

Self-Cleaning Ants

Show HN: WhatsApp Contact Exporter

Weedkiller ingredient widely used in US can damage organs and gut bacteria

I built a free website for remote workers to find laptop friendly coffee shops

I spent $80 and 14 hours to build this, welcome to my new website!

Are Language Models strategic or parrots?

Ask HN: Has AWS ever surprised you with a bill?

U.S. Insurers Are Refusing to Cover Climate Change Risk Zones

A non-anthropomorphized view of LLMs

Comments

Show HN: DeepSky, a New AI Business Agent

CyBearsCTF 2019: Block Dude Writeup

The Weirdest People in the World

SVG Icons Library

Archaeologists in Peru unveil 3,500 year old city that linked coast and Andes

ECC SystemVerilog Generator

Migrating the Jira Database Platform to AWS Aurora

She Wanted to Save the World from A.I. Then the Killings Started

New Intel E610 NICs Shown for Low Power 10Gbase-T and 2.5GbE

Apple Lisa conversion to Macintosh XL Do-it-Yourself Guide (1990) [pdf]

OpenBSD on the 2020 M1 MacBook Air (2022)

The Human Use of Human Beings

Waterbot – a discord bot that controls pins on a raspberry with natural language

Anthropic wins key US ruling on AI training in authors' copyright lawsuit

Free AI Hiring Demo – Paraform

One Year with a Framework Laptop 16 and Fedora KDE Plasma Desktop

Plasma 6.4 Wayland vs. X11, processor and power benchmarks

The Dangers of AI Personalization

The Mental Model of Server Components

Show HN: A pure photo collage tool

Attabotics CEO on devastating week that brought bankruptcy

Show HN: A Language Server Implementation for SystemD Unit Files

Self-Cleaning Ants

Show HN: WhatsApp Contact Exporter

Weedkiller ingredient widely used in US can damage organs and gut bacteria

I built a free website for remote workers to find laptop friendly coffee shops

I spent $80 and 14 hours to build this, welcome to my new website!

Are Language Models strategic or parrots?

Ask HN: Has AWS ever surprised you with a bill?

U.S. Insurers Are Refusing to Cover Climate Change Risk Zones