LLM Problems Observed in Humans

https://embd.cc/llm-problems-observed-in-humans

154•js216•1mo ago

Comments

systemerror•1mo ago

I cannot tell if this is satire, but if it is, bravo.

pixl97•1mo ago

Why would this be satire? I'm guessing you don't pay that much attention to humans, or you have a very curated social group you're around.

hmokiguess•1mo ago

I think the argument that could be made here is that, given evolution, one could say "feature, not a bug." in the context of humans.

pixl97•1mo ago

>given evolution, one could say "feature, not a bug."

The issue with evolution is huge portions of it just happen to exist and not kill the host before they breed. It could be a massive bug that if corrected could cause the host to breed and spread their genes far further, but evolution itself can't reach there.

systemerror•1mo ago

I really can't tell to be honest, so if it's satire it's very good. If it's not then I don't really know what it's saying, humans are bad at stuff too? Many humans are not well educated and are not great conversationalists? Does this somehow make LLMs better in our perspective somehow?

BugsJustFindMe•1mo ago

If it were satire, what do you think it would be satirizing?

> I don't really know what it's saying

It's saying that complaints about deficiencies in LLMs, about a fundamental lack of LLM intelligence, about how LLMs are just statistical machines and not really thinking, about how LLMs are incapable of learning from past experiences, about how LLMs lack any coherent epistemology ignore how very deficient humans are in many same exact ways.

> Does this somehow make LLMs better in our perspective somehow?

Better is a relative measure not an absolute one, so possibly, because views of LLMs are inherently formed in relation to views of the human brains they're modeling.

tart-lemonade•1mo ago

> If it were satire, what do you think it would be satirizing?

Think of the most terminally online drama you've ever witnessed: the hysterics people work themselves into over what (to outside observers) seems utterly inane and forgettable, the multi-page Tumblr or 4chan posts that become the sacred texts of the "discourse", and the outsized importance people ascribe to it, as if some meme, album cover, or Qanon drop is the modern incantation of the shot heard around the world.

The people wrapped up in this stuff tend to self-select into their own communities because if you're not involved with or amenable to caring about it, why should they spend time talking to someone who will just nod, go "huh, that's wild", and proceed to steer the conversation elsewhere? In their eyes, you may even be a weirdo for not caring about this stuff.

So when I read:

> I’ve got a lot of interests and on any given day, I may be excited to discuss various topics, from kernels to music to cultures and religions. I know I can put together a prompt to give any of today’s leading models and am essentially guaranteed a fresh perspective on the topic of interest. But let me pose the same prompt to people and more often then not the reply will be a polite nod accompanied by clear signs of their thinking something else entirely, or maybe just a summary of the prompt itself, or vague general statements about how things should be. In fact, so rare it is to find someone who knows what I mean that it feels like a magic moment. With the proliferation of genuinely good models—well educated, as it were—finding a conversational partner with a good foundation of shared knowledge has become trivial with AI. This does not bode well for my interest in meeting new people.

I'm imagining the more academic equivalent of someone who got wrapped up in Tiktok drama or Q nuttery but couldn't find a community of kindred souls and, frustrated with the perceived intellectual mediocrity surrounding themself, has embraced LLMs for connection instead. And that's just hilarious. If Silicon Valley was still being produced, I'm sure this would have been made into an episode at some point.

The bits about not generalizing and engaging in fallacious reasoning are also quite amusing since, while yes, the average person likely would benefit from taking (and paying attention in) a couple introductory philosophy classes, expecting all humans to behave logically and introspectively is fantastical thinking.

BugsJustFindMe•4w ago

> expecting all humans to behave logically and introspectively is fantastical thinking

Yes, that is exactly the point of OP's post, that humans are on average quite bad at behaving logically and introspectively and exhibit the very same behaviors that we righteously fault AI for doing. And then OP provides a list of faulty human behaviors that are the same faulty behaviors people give as demonstrating that AI lacks true intelligence.

Meanwhile, AI continues to improve and the human species does not.

And the conclusion is that the fact that the rise of AI has made human faulty behaviors more apparent may creepingly tear at the social fabric.

Read the first paragraph again. It sets the framing through which the rest of the post is understood (as first paragraphs tend to do).

I find this exchange to be a funny example of the truth of OP's list, where the part which sticks with you is some finer detail of one of the examples while the thesis statement itself, the very explanation of the overarching point of the post, seems to have fallen outside of the context window.

staticman2•1mo ago

>about how LLMs are just statistical machines and not really thinking,

I don't think the article said anything about statistics?

This seems to be a sort of Rorchasch test but looking at it again:

>This does not bode well for my interest in meeting new people

It really does seem to me the article is making fun of people who think this sort of article is on point.

There's a genre of satire where the joke is that it makes you ask "Who the heck is the sort of person who would write this?"

It could fit in that genre but of course I could be wrong.

BugsJustFindMe•4w ago

> I don't think the article said anything about statistics?

I don't think I said or implied that it did. It's merely one of the many positions that people commonly (and defensively) take for why LLMs aren't and/or can't be intelligent like humans, ignoring that humans exhibit exactly the same patterns.

empath75•1mo ago

The point is that if you have benchmarks for intelligence, which humans would also fail, then you have to concede that either humans are not intelligent, or that the benchmarks are too strict, or aren't a measure for intelligence at all.

allears•1mo ago

The thing is, LLMs would fail that test every time, but humans would pass it most of the time (hopefully). Just because humans are fallible doesn't make LLMs intelligent.

We really haven't got a grip on what intelligence actually is, but it seems that humans and LLMs aren't really in the same ballpark, or even the same league.

pixl97•1mo ago

>haven't got a grip on what intelligence actually is

Because intelligence isn't a thing, it's a bunch of different things that some intelligent things have more or less (or none of).

This is why measures of intelligence always fail because we try to binary it which doesn't work. Intelligence is spikey. Intelligence scales from very small and dumb to very smart. But even the things that are very smart on a lot of things still do very dumb things. We also measure human intelligence as a function of all humans and LLM intelligence on a particular model.

So yea, this is why nothing seems to make sense.

staticman2•1mo ago

You ask why this would be satire?

Well let's take a look at this:

>The best thing about a good deep conversation is when the other person gets you: you explain a complicated situation you find yourself in, and find some resonance in their replies.

>That, at least, is what happens when chatting with the recent large models.

The first sentence says a good conversation is between two people. The author then pulls the rug out and says "Psych. A good conversation is when I use LLMs."

The author points out humans have decades of memories but is surprised that when they tell someone they are wrong they don't immediately agree and sycophantically mirror the author's point of view.

The author thinks it's weird they don't know when the next eclipse is. They should know this info intuitively.

The author claims humans have a habit of being wrong even in issues of religion but models have no such flaw. If only humans embraced evidence based religious opinions like LLMs.

The author wonders why they bothered writing this article instead of asking ChatGPT to write it.

Did you ask an LLM if this is satire?

I did and Opus said it wasn't satire.

This was clearly a hallucination so I informed it it was incorrect and it changed it's opinion to agree with me so clearly I known what I'm talking about.

I'll spare you the entire output but among other things after I corrected it it said:

The "repeating the same mistakes" section is even better once you see it. The complaint is essentially: "I told someone they were wrong, and they didn't immediately capitulate. Surely pointing out their error should rewire their brain instantly?" The author presents this as a human deficiency rather than recognizing that disagreement isn't a bug.

hmokiguess•1mo ago

You're absolutely right! (forgot that one)

fsflover•1mo ago

> reward hacking, where social approval or harmony is optimized at the expense of truth or usefulness

DenisM•1mo ago

Blending is socially is very useful for the individual, i don’t agree with the author here.

Society itself may benefit from cohesion or from truth depending on circumstances.

chankstein38•1mo ago

While I haven't experienced LLMs correcting most (or any) of the problems listed fully and consistently, I do agree that consistent use of LLMs and dealing with their frustrations has worn my patience for conversations with people who exhibit the same issues when talking.

It's kind of depressing. I just want the LLM to be a bot that responds to what I say with a useful response. However, for some reason, both Gemini and ChatGPT tend to argue with me so heavily and inject their own weird stupid ideas on things making it even more grating to interact with them which chews away at my normal interpersonal patience which, as someone on the spectrum, was already limited.

acedTrex•1mo ago

This is why i simply do not bother with them unless the task i need is so specific that theres no room for argument, like yesterday i asked it to generate me a bash script that ran aws ssm commands for all the following instance IDs. It did that as a two shot.

But long conversations are never worth it.

empath75•1mo ago

> ChatGPT tend to argue with me so heavily

I have found that quite often when ChatGPT digs in on something, that it is in fact right, and I was the one that was wrong. Not always, maybe not even most of the time, but enough that it does give me pause and make me double check.

Also, when you have an LLM that is too agreeable, that is how it gets into a folie a deux situation and starts participating in user's delusions, with disastrous outcomes.

dns_snek•1mo ago

> Also, when you have an LLM that is too agreeable...

It's not a question of whether an LLM should be agreeable or argumentative. It should aim to be correct - it should be agreeable about subjective details and matters of taste, it should be argumentative when the user is wrong about a matter of fact or made an error, and it should be inquisitive and capable of actually re-evaluating a stance in a coherent and logically sound manner when challenged by the user instead of either "digging in" or just blindly agreeing.

empath75•1mo ago

> it should be agreeable about subjective details and matters of taste,

What if your subjective opinion is that you think life isn't worth living, how should an LLM respond to that.

dns_snek•1mo ago

That's philosophy and mental health, I was talking about technical or other "work" topics.

But to answer the question, it depends on the framing - if someone starts the chat by saying that they feel like life isn't worth living then the LLM should probably suggest reaching out to local mental health services and either stop the conversation or play a role in "listening" to them. It shouldn't judge, encourage, or agree necessarily. But it would probably be best to cut the conversation unless there's a really high level of confidence that the system won't cause harm.

topaz0•1mo ago

This is what people should want from an intellectual slave, sure, and I don't think it's going to happen for llms.

ACCount37•1mo ago

Which is not an easy thing to tune for.

So much easier to just make it agree all the time or disagree all the time. And trying to bottle the lightning often just causes degeneracy when you fail.

crazygringo•1mo ago

This is my experience too. About 2/3 of the time my question/prompt contained ambiguity and it interpreted it differently (but validly), so it's just about misunderstanding, but maybe 1/3 of the time I'm surprised to discover something I didn't know. I double-check it on Wikipedia and a couple of other places and learn something new.

bicepjai•1mo ago

>>> … However, for some reason, both Gemini and ChatGPT tend to argue with me so heavily and inject their own weird stupid ideas on things …

This is something I have not experienced. Can you provide examples ?

agloe_dreams•1mo ago

Yeah this is exactly opposite my issue with LLMs. They often take what you say as the truth when it absolutely could not be.

rguzman•1mo ago

> However, for some reason, both Gemini and ChatGPT tend to argue with me so heavily and inject their own weird stupid ideas on things

do you have examples of this?

asking because this is not what happens to me. one of the main things i worry about when interacting with the llm is that they agree with me too easily.

mikasisiki•1mo ago

There was a period when coding agents would always agree with you, even if you gave them a really bad idea. They’d always start with something like, “You’re right — I should…”.

Back then, what we actually wanted was for them to push back and argue with us.

GuB-42•1mo ago

I have taken the stance to not argue with LLMs, don't give them any clues, and don't ask them to roleplay. Tell them no more than what they need to know.

And if they get the answer wrong, don't try to correct them or guide them, there is a high chance they don't have the answer and what follow will be hallucinations. You can ask for details, but don't try to go against it, it will just assume you are right (even if you are not) and hallucinate around that. Keep what you already know to yourself.

As for the "you are an expert" prompts, it will mostly just make the LLM speak more authoritatively, but it doesn't mean it will be more correct. My strategy is now to give the LLM as much freedom as it can get, it may not be the best way to extract all the knowledge it has, but it helps spot hallucinations.

You can argue with actual people, if both of you are open enough, something greater make come out of it, but if not, it is useless, and with LLMs it is always useless, they are pretrained, they won't get better in the future because that little conversation sparked their interest. And on your side, you will just have your own points rephrased and sent back to you, and that will just put you deeper in your own bubble.

eru•3w ago

What's the purpose of your stance? What are you trying to achieve?

okwhateverdude•1mo ago

> However, for some reason, both Gemini and ChatGPT tend to argue with me

The trick here is: "Be succinct. No commentary."

And sometimes a healthy dose of expressing frustration or anger (cursing, berating, threatening) also gets them to STFU and do the thing. As in literally: "I don't give a fuck about your stupid fucking opinions on the matter. Do it exactly as I specified"

Also generally the very first time it expresses any of that weird shit, your context is toast. So even correcting it is reinforcing. Just regenerate the response.

CamperBob2•1mo ago

Last time I bawled out an LLM and forced it to change its mind, I later realized that the LLM was right the first time.

One of those "Who am I and how did I end up in this hole in the ground, and where did all these carrots and brightly-colored eggs come from?" moments, of the sort that seem to be coming more and more frequently lately.

Aerbil313•1mo ago

Yeah, same. Lately almost every time I think "Oh no way, this is not the correct way/not the optimal way/it's a hallucination" it later turns out that it's actually the correct way/the optimal way/it's not a hallucination. I now think twice before doing anything differently than what the LLM tells me unless I'm an expert on the subject and can already spot mistakes easily.

It seems like they really figured out grounding and the like in the last couple of months.

eru•3w ago

I wouldn't worry too much about these false negatives: your human friends might be cross if you constantly accuse them of being wrong when they are actually right, but the LLMs are too polite to hold a grudge.

mrweasel•1mo ago

An absolute enjoyable read. It also raises a good point, regarding the Turing test. I have a family member who teaches adults and as she pointed out: You won't believe how stupid some people are.

As critical as I might be of LLMs, I fear that they already outpaced a good portion of the population "intellectually". There's a lower level, which modern LLMs won't cross, in terms of lack of general knowledge or outright stupidity.

We may have reached a point where we can tell that we're talking to a human, because there's no way a computer would lack such basic knowledge or display similar levels of helplessness.

skybrian•1mo ago

Whenever we're testing LLM's against people we need to ask "which people?" Testing a chess bot against random undergrads versus chess grandmasters tells us different things.

From an economics perspective, maybe a relevant comparison is to people who do that task professionally.

TeodorDyakov•1mo ago

Gdpeval

voxleone•1mo ago

I sometimes feel a peculiar resonance with these models: they catch the faintest hints of irony and return astoundingly witty remarks, almost as if they were another version of myself. Yet all of the problems, inconsistencies, and surprises that arise in human thought stem from something profoundly differen, which is our embodied experience of the world. Humans integrate sensory feedback, form goals, navigate uncertainty, and make countless micro-decisions in real time, all while reasoning causally and contextually. Cognition is active, multimodal, and adaptive; it is not merely a reflection of prior experience a continual construction of understanding.

And then there are some brilliant friends of mine, people with whom a conversation can unfold for days, rewarding me with the same rapid, incisive exchange we now associate with language models. There is, clearly, an intellectual and environmental element to it.

leonidasv•1mo ago

> While some are still discussing why computers will never be able to pass the Turing test

Are people still debating that? I thought it was settled by the time GPT-4 came out.

gcuvyvtvv6•1mo ago

You're absolutely right.

rcruzeiro•1mo ago

You are right to push back on that.

omnicognate•1mo ago

There is no the Turing test. I think ELIZA was the first program to pass a Turing test, around 60 years ago.

empath75•1mo ago

I have an idea for a reverse turing test where humans have to convince an LLM that they are an LLM. I suspect that most people would fail, proving that humans lack intelligence.

IshKebab•1mo ago

Even some of the things that people think are just broken in LLMs are common in children, e.g. repeating things (getting stuck in a loop) or their inability to understand humour.

ACCount37•1mo ago

The good old "kilogram of steel" test is something that the children, the mentally infirm and the less capable LLMs all fail in the same way.

Some failures like that are simply human failures reproduced faithfully. Some are rooted deeper than that.

And yes, it's true that children don't get bored in the same way adults do, which often leads to repetitive behavior. Boredom is an important heuristic for behavior, it seems.

IshKebab•1mo ago

> repetitive behavior

I mean literally saying the same thing again and again. Like "And then I played and then I played and then I played and then I played..."

ACCount37•1mo ago

Yes, including that. It's just the small version of the thing LLMs are prone to.

A lot of LLM behaviors are self-reinforcing across context, and this includes small stupid loops and the more elaborate variants. Like an LLM making a reasoning mistake, catching it while checking itself, and then making it again, 5 times in a row.

postexitus•1mo ago

Tangentially related, but I enjoyed reading "The Most Human Human" by Brian Christian - granted it's written in a pre-LLM world, it's still very much relevant.

The book is following the annual Turing Test competition, in which, humans are chatting with AIs or real humans without knowing which is which and give them a score out of 10 for being most human and the AI that is "the most human" wins the competition. The twist is, not all humans get 10/10 for being human either - so the human that's the most human also wins a prize.

afspear•1mo ago

Maybe we should find other datasets not generated by humans to train LLMs?

threethirtytwo•1mo ago

The datasets going into LLMs have to have an element of human-ness to it.

For example I can’t just feed it weather data from the past decade and expect it to understand weather. It needs input and output pairs with the output being human language. So you can feed it weather data but it has to be paired with human description of said data. So if we give it data of a rain storm there has to be an english description paired with it saying it’s a rainstorm.

ACCount37•1mo ago

Sadly, we have n=1 for intelligence and that's humans. The "second best" of intelligence is already LLMs. And it's hard to expect imitation learning on data that wasn't produced by anything intelligent to yield intelligence - although there are some curious finds.

Even for human behavior: we don't have that much data. The current datasets don't capture all of human behavior - only the facets of it that can be glimpsed from text, or from video. And video is notoriously hard to use well in LLM training pipelines.

That LLMs can learn so much from so little is quite impressive in itself. Text being this powerful was, at its time, an extremely counterintuitive finding.

Although some of the power of modern LLMs already comes from nonhuman sources. RLVR and RLAIF are major parts of training recipes for frontier labs.

SubiculumCode•1mo ago

Is it too late to call it confabulation rather than hallucination? Its such a more appropriate term for both LLM "hallucinations" with an entire scientific literature on it in humans.

bookofjoe•1mo ago

Concur. Alas, that ship has sailed.

baq•1mo ago

Never say never. It'd take a few tweets from Karpathy et al.

wrs•1mo ago

I don't know about that...his initially-useful definition of "vibecoding" is just about hopelessly lost at this point.

officehero•1mo ago

Please define "confabulation" (for us stupid non-AI, non-native speakers)

SubiculumCode•1mo ago

https://en.wikipedia.org/wiki/Confabulation

officehero•3w ago

You missed the point.

tux•1mo ago

“In fact, so rare it is to find someone who knows what I mean that it feels like a magic moment.”

There, lack of interest from the person you talking to or you when listening. It’s because you have different interests. This is a human feature not a flaw. But it’s interesting to think that LLMs might have similar behavior :-)

“I’ll never again ask a human to write a computer program shorter than about a thousand lines, since an LLM will do it better.”

From my personal experience with ChatGPT it can’t even correctly write few lines of code. But i don’t use AI often. I just don’t find it that useful. From what i see it’s mostly a hype bubble that will burst.

But this is my personal opinion and my own observation. I could be wrong :-)

egypturnash•1mo ago

The best thing about a good deep conversation is when the other person gets you: you explain a complicated situation you find yourself in, and find some resonance in their replies. That, at least, is what happens when chatting with the recent large models. But when subjecting the limited human mind to the same prompt—a rather long one—again and again the information in the prompt somehow gets lost, their focus drifts away, and you have to repeat crucial facts. In such a case, my gut reaction is to see if there’s a way to pay to upgrade to a bigger model, only to remember that there’s no upgrading of the human brain.

Paying for someone to put some effort into giving a damn about what you have to say has a long history. Hire a therapist. Pay a teacher. Hire a hooker. Buy a round of drinks. Grow the really good weed and bring it to the party.

And maybe remember that other humans have their own needs and desires, and if you want them to put time and energy into giving a damn about your needs, then you need to reciprocate and spend time doing the same for them instead of treating them like a machine that exists only to serve you. This whole post is coming from a place of reducing every relationship to that and it's kind of disgusting.

vinceguidry•1mo ago

Yeah, shared context over time is the answer to all these problems and has been for both history and prehistory. Patience appears to be the scarcest resource of all these days.

chambored•1mo ago

This is exactly why I stopped reading after that section.

psunavy03•1mo ago

It's sadly also an attitude I'm not surprised to see coming out of tech, given how many people don't seem to get that "I got into this field so I could interact with computers, not people" is supposed to be a joke.

cjs_ac•1mo ago

> You know my favourite bit of that story? I just made it up. Yeah, it's not true. There is no Morgan. Ooh! It's very unsatisfying, isn't it? But I saw him in my head. I saw Morgan in my head.

> Why is it we can feel so robbed when someone tells us a story we just heard isn't true, and yet so satisfied at the end of a fictional novel? I don't know. I don't know.

-- Randy Writes a Novel

If the thing made by a machine is indistinguishable from the thing made by a human, the thing made by a human will be more valuable, simply because being made by a human is an opportunity for a story, and we humans like and value stories.

hotpotat•1mo ago

> my gut reaction is to see if there’s a way to pay to upgrade to a bigger model, only to remember that there’s no upgrading of the human brain

this might be one of the most sociopathic things I’ve ever read

Dilettante_•1mo ago

I think the author might not have been speaking 100% with his tongue not in his cheek there.

raygelogic•1mo ago

if you're actually struggling to get people to interact with you the way you want, I think the real problem is your expectations of other people. if they miss the plot, it might be because you timed the conversation poorly, or because you talked to the wrong person for what you need.

this whole post reads like it's coming from someone who sees people as tools to get what they need. the reason I talk to people when I'm struggling with a problem isn't for reference, but for connection, and to get my own wheels turning.

I'll grant that it's interesting to think about. now that LLMs exist, we're forced to assess what value human brains provide. it's so dystopian. but there's no other choice.

flowerthoughts•1mo ago

You're absolutely right!

But this type of... conflict aversion... is definitely more common in LLMs than in humans. Even the most positive humans I know sometimes crack.

rdiddly•1mo ago

Is he saying humans have become this way because of the influence of LLMs? Because actually the reverse is true.

BugsJustFindMe•1mo ago

> Is he saying humans have become this way because of the influence of LLMs?

No. The first paragraph explains it quite clearly, IMO: "While some are still discussing why computers will never be able to pass the Turing test, I find myself repeatedly facing the idea that as the models improve and humans don’t, the bar for the test gets raised and eventually humans won’t pass the test themselves."

The point is not that the problems exist more in humans now vs before. It's that they can be observed more significantly in humans than in LLMs (and moreso over time) if one cares to look because LLMs improve and humans do not on sub-evolutionary timescales. And perhaps our patience with them in humans is now diminished because of our experiences with them in LLMs and so people may notice them in humans more than before.

vinceguidry•1mo ago

Humans have always been this way, what we lack now is the patience to put up with it.

bs7280•1mo ago

I've noticed that a lot of people most skeptical of AI coding tools are biased by their experience working exclusively at some of the top software engineering organizations in the world. As someone who has never worked at a company anywhere close to FAANG, I have worked with both people and organization's that are horrifyingly incompetent. A lot of software organization paradigms are designed to play defense against poorly written software.

I feel similar about self driving cars - they don't have to be perfect when half the people on the road are either high, watching reels while driving, or both.

psunavy03•1mo ago

Few things enrage me like the smell of cannabis on the highway after it was legalized in my state. Sure, hypothetically, that's the passenger. But more likely than not, it's DUI.

macintux•1mo ago

Sitting in a Jeep with no doors, no top, no windows has revealed to me just how common cannabis is in my state, even not yet legalized. Hate the smell.

bs7280•1mo ago

Off topic of my original comment but I live in Chicago and have seen some of the most batshit insane drivers / behavior on the road you could imagine. People smoking are often the least of my worries (not to say its ok).

codyb•1mo ago

What, as opposed to the people on painkillers, xanax, caffeine, nicotine, and of course the actual worst... too little sleep, too much alcohol, and their phones.

psunavy03•1mo ago

Other things also being wrong does not make driving under the influence of cannabis any less wrong.

codyb•4w ago

The studies are actually pretty interesting here...

This article is from 2021 - https://www.iihs.org/news/detail/crash-rates-jump-in-wake-of...

The conclusion seems to be that if you _only_ smoke marijuana you're actually less likely to be involved in a crash than a sober driver, but if you combine marijuana with alcohol you're _more_ likely to crash (which, duh).

Obviously not totally conclusive, but interesting none the less. Anecdotally, coming from a high school where folk smoke and drove all the time because they couldn't smoke in their houses or on the street where they'd face police harassment, it was always the alcohol that got them nabbed for DUIs. It's anecdotal, but my anecdotes are many and I'm not sure I've heard of any one I've ever known crashing while just smoking weed.

So... maybe everyone should toke a little before they drive, sounds like they'd leave more distance between the cars in front of them, and go at a more relaxed pace, and not try to do any crazy passes of the people in front of them. Road rage is a very real thing in America, and the stereotype isn't of your typical stoner.

eru•3w ago

> The conclusion seems to be that if you _only_ smoke marijuana you're actually less likely to be involved in a crash than a sober driver, [...]

I assume that's not a randomised controlled study? Ie there's probably all kinds of confounders etc.

chelmzy•1mo ago

This has been my experience as well. I see very bright people lampooning LLMs because it doesn't perform up to their expectations when they are easily in the top 1% of talent in their field. I don't think they understand the cognitive load in your average F500 role is NOT very high. Most people are doing jack shit.

demorro•1mo ago

Everyone is still holding out hope for a better future. LLM advocates making this argument are saying that the field can never improve, so might as well just let the mediocre machine run rampant.

Perhaps idealistic, perhaps unrealistic. I'd still rather believe.

chelmzy•1mo ago

I think AI adoption is going to be catastrophic and my only hope is that we can slow down and tread carefully. Chances that occurs are slim. I'm certainly not pro AI. It just really angers me to see people still denying the impact.

eru•3w ago

What catastrophes do you expect?

theshrike79•4w ago

Exactly, we are focusing on the absolute amount of crashes by "self driving" cars.

What we should focus is that are they more or less prone to accidents than actual humans based on amount of km driven.

Again, there are those Expert Drivers who love their manual transmission BMW because automatics shift in the wrong RPM range and abhor any kind of lane assist because it doesn't drive EXACTLY like they do.

But the vast majority of average people on the road will definitely get gains from lane assist and lane keeping functions in cars.

eru•3w ago

> A lot of software organization paradigms are designed to play defense against poorly written software.

Maybe. But the organisations who would need the defense most are the some of the least likely to apply them.

Eg it was better run organisations that had version control early, and the worse ones persisted with using shared folders for longer.

And strong type systems like what Haskell or to a lesser extent Rust have to offer are useful as safeguards for anyone, but even more useful when your organisation and its members aren't all that great. Yet again, we see more capable organisations adopting these earlier.

DenisM•1mo ago

> When a model exhibits hallucination, often providing more context and evidence will dispel it,

I usually have the opposite experience. One a model goes off the rails it becomes harder and harder to steer and after a few corrective prompts they stop working and it’s time for a new context.

ACCount37•1mo ago

It depends.

It's a natural inclination for all LLMs, rooted in pre-training. But you can train them out of it some. Or not.

Google doesn't know how to do it to save their lives. Other frontier labs are better at it, but none are perfect as of yet.

foobiekr•1mo ago

Once it’s in the context window the model invariably steers crazy. Llms cannot handle the “don’t think of an elephant” requirement.

Jiro•1mo ago

This article is nonsense. It's taking advantage of the fact that the problems with LLMs are being described with very broad wording, and then noticing that you can fit human behavior into those descriptions because of how broadly they are worded.

It's like getting a gorilla to fly an airplane, noticing that it crashed the airplane, and saying "humans sometimes crash airplanes too". Both gorillas and humans do things that fit into the broad category "crash an airplane" but the details and circumstances are different.

throw4847285•1mo ago

Arguments like this make me suspect that the proponents have simply a malformed theory of mind. If I'm being really catty, I'll say it's because they have below average levels of self-awareness.

Dilettante_•1mo ago

I have definitely, absolutely, positively had conversations where details have fallen out of the context window of my conversation partner(or mine, for that matter), without the person in question realizing this has happened, and have only via LLMs found a vocabulary to give a name to the phenomenon.

somewhereoutth•1mo ago

It is a common trope that tech people have a good understanding of computers, but a bad understanding of people. I see no evidence here to dispel that trope.

canjobear•1mo ago

> the bar for the test gets raised and eventually humans won’t pass the test themselves.

At this point LLMs usually beat humans at the Turing Test! People are more likely to pick the LLM as the human, rather than the human. https://arxiv.org/abs/2503.23674

breuleux•1mo ago

What we observe is also consistent with the idea that when humans have no idea what they're talking about, it's usually more obvious than when LLMs have no idea what they're talking about. In which case the author is lulling themselves into a false sense of confidence chatting with AI instead of humans, merely trading one form of incompetence for another.

crazygringo•1mo ago

> when humans have no idea what they're talking about, it's usually more obvious

Is it?

That's not my experience.

breuleux•1mo ago

I think so, yes. We rely a lot on eloquence and general knowledge as signals of competence, and LLMs beat most people at these. That's the "usually" -- I don't think good human bullshitters are more obvious than LLMs.

This may not apply to you if you regard LLMs, including their established rhetorical patterns, with greater suspicion or scrutiny (and you should!) It also does not apply when talking about subjects in which you are knowledgeable. But if you're chatting about things you are not knowledgeable about, and you treat the LLM just like any human, I think it applies. There's a reason LLM psychosis is a thing, rhetorically these things can simulate the ability of a cult leader.

crazygringo•1mo ago

I think I'm going to have to disagree. When people tell you something incorrect, they usually believe it's correct and that they're trying to help. So it comes across with full confidence, helpfulness, and a trustworthy attitude. Plus people often come with credentials -- PhD's, medical degrees, etc. -- so we're even more caught off-guard when they turn out to be totally and completely wrong about something.

On the other hand, LLM's are just text on a screen. There are zero of the human signals that tell us someone is confident or trustworthy or being helpful. It "feels" like any random blog post from someone I don't know. So it makes you want to verify it.

jdauriemma•1mo ago

The narrative structure of the article would be brilliant satire but I'm 90% certain that the author is serious about the conclusions they drew at the end, which I find sad.

jdthedisciple•1mo ago

Yea I find it a bit condescending. Humans ain't robots, duh!

And the world wouldn't function if everyone operated at the exact same abstraction level of ideas.

jdauriemma•1mo ago

The big difference is accountability. An LLM has no mortality; it has no use for fear, no embodied concept of reputation, no persistent values. Everything is ephemera. But they are useful! More useful than humans in some scenarios! So there's that. But when I consider the purpose of conversation, utility is only one consideration among many.

t8sr•1mo ago

Something about the way the author expresses himself (big words, “I am so smart”, flowery filler) makes me unsurprised he finds it hard to have satisfying conversations with people. If he talked to me like this IRL I wouldn’t be trying to have a deep conversation either, I’d just be looking for the exit.

Lacking a theory of mind for other people is not a sign of superiority.

sph•1mo ago

Sadly that’s a personality trait that’s far too common in the field, and it can get pretty annoying.

hug•1mo ago

Jumping from "the author uses language I dislike" straight to "also, he has no theory of mind" is a bit of a leap. Like world record winning long jump kinda stuff.

Also, what big words? 'Proliferation'? 'Incoherent'? The whole article is written at a high school reading level. There's some embedded clauses in longer sentences, but we're not exactly slogging our way through Proust, here.

josecodea•2w ago

You are being too generous by saying that there are big words in the text. I find it blunt and uncouth. Actually, that's the problem that I see in the text, an attitude of pessimism and lack of self-reflection. An LLM would certainly give me something more interesting to read!

delichon•1mo ago

Humans retreating from other humans for their social needs (as with pornography before AI) takes the propinquity away from reproduction. Breeding becomes a progressively more intentional, high agency act. Without some compensating force, fertility drops. Artificial companionship may be an existential threat to a social species.

catskul2•1mo ago

... or circumstances now select for "high agency" humans, and those humans propagate.

Not sure why we think normal evolution wouldn't just route around such problems.

delichon•1mo ago

It certainly will but not necessarily to our species' benefit. I'd be less concerned if I understood why this happens in mice:

https://www.the-scientist.com/universe-25-experiment-69941

sph•1mo ago

Evolution operates in much larger time scales.

Dilettante_•1mo ago

Evolution operates at every time scale.

tpoacher•1mo ago

Reminds me of that quote: “There is considerable overlap between the intelligence of the smartest bears and the dumbest tourists.” [0]

[0] https://www.schneier.com/blog/archives/2006/08/security_is_a...

moribvndvs•4w ago

> Another interpretation would be to conclude cynically that it’s time humans get either enhanced or replaced by a more powerful form of intelligence.

Perhaps the author is just gaming out a thought experiment, but I’ll just take it at face value. I am genuinely baffled by the obsequiousness some people display regarding LLMs. Let’s assume it really is a more powerful form of intelligence (ugh) and it “replaces” people, how do you think that ends for you?

You are trying to convince yourself that you’ve happened upon a benevolent god that truly, deeply understands you while staring into a reflection pool.

kundan_s__r•3w ago

Interesting reflection — but I’d push back on treating surface similarities between human conversational quirks and LLM failure modes as evidence they’re really the same thing. The article lists things like “not stopping generating,” “small context window,” and “repeating mistakes,” and frames them as if current LLM patterns are just human behaviors seen through a new lens.

The key difference is mechanism. For humans, wandering off topic or repeating a point usually stems from attention, memory, or social dynamics. For LLMs, similar-looking behavior often arises from architectural and statistical limitations — e.g., limited context windows causing context drift that makes the model lose track of earlier facts over long conversations. That’s been documented as a core technical barrier in conversational AI systems.

Likewise, what the article calls “persistent hallucination” isn’t just “being confidently wrong like a human might be” — it’s a systemic property of how LLMs predict the next token based on learned patterns rather than grounding in external truth. Hallucinations are literally outputs that are factually incorrect or fabricated even if they read fluently. That’s why engineering teams talk about hallucination detection, grounding, or external verification layers for production use.

Framing them as the same as human flaws risks obscuring the real challenge: deploying LLMs in systems where correctness, consistency, and adherence to explicit intent matter. Those aren’t just philosophical problems; they’re system-design constraints that require mechanism-aware validation and monitoring, not just metaphors to human behavior.

Interop 2025: A Year of Convergence

JobArena – Human Intuition vs. Artificial Intelligence

Concept Artists Say Generative AI References Only Make Their Jobs Harder

Show HN: PaySentry – Open-source control plane for AI agent payments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

The Crumbling Workflow Moat: Aggregation Theory's Final Chapter

Pax Historia – User and AI powered gaming platform

Show HN: I built a RAG engine to search Singaporean laws

Scams, Fraud, and Fake Apps: How to Protect Your Money in a Mobile-First Economy

Porting Doom to My WebAssembly VM

Cognitive Style and Visual Attention in Multimodal Museum Exhibitions

Full-Blown Cross-Assembler in a Bash Script

Logic Puzzles: Why the Liar Is the Helpful One

Optical Combs Help Radio Telescopes Work Together

Show HN: Myanon – fast, deterministic MySQL dump anonymizer

The Tao of Programming

Forcing Rust: How Big Tech Lobbied the Government into a Language Mandate

PanelBench: We evaluated Cursor's Visual Editor on 89 test cases. 43 fail

Can You Draw Every Flag in PowerPoint? (Part 2) [video]

Show HN: MCP-baepsae – MCP server for iOS Simulator automation

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

Show HN: Sem – Semantic diffs and patches for Git

Hello world does not compile

Show HN: ZigZag – A Bubble Tea-Inspired TUI Framework for Zig

Metaphor+Metonymy: "To love that well which thou must leave ere long"(Sonnet73)

Show HN: Django N+1 Queries Checker

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

Protocol Validation with Affine MPST in Rust

Female Asian Elephant Calf Born at the Smithsonian National Zoo

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

Interop 2025: A Year of Convergence

JobArena – Human Intuition vs. Artificial Intelligence

Concept Artists Say Generative AI References Only Make Their Jobs Harder

Show HN: PaySentry – Open-source control plane for AI agent payments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

The Crumbling Workflow Moat: Aggregation Theory's Final Chapter

Pax Historia – User and AI powered gaming platform

Show HN: I built a RAG engine to search Singaporean laws

Scams, Fraud, and Fake Apps: How to Protect Your Money in a Mobile-First Economy

Porting Doom to My WebAssembly VM

Cognitive Style and Visual Attention in Multimodal Museum Exhibitions

Full-Blown Cross-Assembler in a Bash Script

Logic Puzzles: Why the Liar Is the Helpful One

Optical Combs Help Radio Telescopes Work Together

Show HN: Myanon – fast, deterministic MySQL dump anonymizer

The Tao of Programming

Forcing Rust: How Big Tech Lobbied the Government into a Language Mandate

PanelBench: We evaluated Cursor's Visual Editor on 89 test cases. 43 fail

Can You Draw Every Flag in PowerPoint? (Part 2) [video]

Show HN: MCP-baepsae – MCP server for iOS Simulator automation

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

Show HN: Sem – Semantic diffs and patches for Git

Hello world does not compile

Show HN: ZigZag – A Bubble Tea-Inspired TUI Framework for Zig

Metaphor+Metonymy: "To love that well which thou must leave ere long"(Sonnet73)

Show HN: Django N+1 Queries Checker

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

Protocol Validation with Affine MPST in Rust

Female Asian Elephant Calf Born at the Smithsonian National Zoo

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

LLM Problems Observed in Humans

Comments