
The Problem with LLMs

https://www.deobald.ca/essays/2026-02-10-the-problem-with-llms/
29•vinhnx•3h ago

Comments

bayarearefugee•1h ago
> LLMs will always be plagiarism machines but in 40 years we might not care.

40 years?

Virtually nobody cares about this already... today.

(I'm not refuting the author's claim that LLMs are built on plagiarism, just noting how the world has collectively decided to turn a blind eye to it)

anonu•1h ago
I stopped reading after "problem with LLMs is plagiarism"...
acjohnson55•38m ago
Too bad. You missed some interesting stuff. And I say that as someone who sees some of this very differently than the author.

Announcing that one line of the piece made you mad without providing any other thought is not very constructive.

bronlund•1h ago
Give it up. Buddha would not approve.

And there will be more compute for the rest of us :)

CuriouslyC•1h ago
Can we as a group agree to stop upvoting "AI is great" and "AI sucks" posts that don't make novel, meaningful arguments that provoke real thought? The plagiarism argument is thin and feels biased, the lock-in argument is counter to the market dynamics that are currently playing out, and in general the takes are just one dude's vibes.
ares623•1h ago
> The plagiarism argument is thin and feels biased

are you being serious with this one

CuriouslyC•1h ago
If you're already sold on the plagiarism narrative that big entertainment is trying to propagandize in order to get leverage against the tech companies, nothing I say is going to change your mind.
KittenInABox•52m ago
I don't really know what you mean by "big entertainment" trying to get leverage against tech companies. Tech companies are behemoths. Most of the artists I know fretting about AI don't earn half a junior engineer's salary. And this is coming from someone who is relatively bullish on AI. I just don't think the framing of "big entertainment" makes any sense at all.
gwern•54m ago
I don't know, this one is a little novel. I've never seen the developer of a Buddhist meditation app discuss whether to use LLMs with a paragraph like:

> Pariyatti’s nonprofit mission, it should be noted, specifically incorporates a strict code of ethics, or sīla: not to kill, not to steal, not to engage in sexual misconduct, not to lie, and not to take intoxicants.

Not a whole lot of Pali in most LLM editorials.

akoboldfrying•43m ago
> not to engage in sexual misconduct

I must remember to add this quality guarantee to my own software projects.

My software projects are also uranium-free.

pixelmelt•48m ago
I dunno, I enjoyed reading about how the author personally feels about the act of working with them more than the whole "is this moral" part.
bambax•1h ago
> Translators are busy

No they're not. They're starving, struggling to find work and lamenting AI is eating their lunch. It's quite ironic that after complaining LLMs are plagiarism machines, the author thinks using them for translation is fine.

"LLMs are evil! Except when they're useful for me" I guess.

beering•35m ago
Simultaneously, if you hire human translators, you are likely to get machine translations. Maybe not often or overtly, but the translation industry has not been healthy for a while.
woeirua•50m ago
>As a quick aside, I am not going to entertain the notion that LLMs are intelligent, for any value of “intelligent.” They are robots. Programs. Fancy robots and big complicated programs, to be sure — but computer programs, nonetheless. The rest of this essay will treat them as such. If you are already of the belief that the human mind can be reduced to token regurgitation, you can stop reading here. I’m not interested in philosophical thought experiments.

I can't imagine why someone would want to openly advertise that they're so closed minded. Everything after this paragraph is just anti-LLM ranting.

acjohnson55•42m ago
It was actually much less anti LLM than I was expecting from the beginning.

But I agree that it is self limiting to not bother to consider the ways that LLM inference and human thinking might be similar (or not).

To me, they seem to do a pretty reasonable emulation of single-threaded thinking.

hodgehog11•37m ago
I disagree that the majority of it is anti-LLM ranting; there are several subtle points here that are grounded in realism. You should read on past the first bit if you're judging mainly from the (admittedly naive) opening paragraphs.
wolrah•33m ago
> I can't imagine why someone would want to openly advertise that they're so closed minded.

I would say the exact same about you. Rejecting an absolutely accurate and factual statement like that as closed minded strikes me the same way as the people who insist that medical science is closed minded about crystals and magnets.

I can't imagine why someone would want to openly advertise they think LLMs are actual intelligence, unless they were in a position to benefit financially from the LLM hype train of course.

Ygg2•30m ago
> I can't imagine why someone would want to openly advertise that they're so closed minded.

Because humans often anthropomorphize completely inert things? E.g. a coffee machine or a bomb disposal robot.

So far whatever behavior LLMs have shown is basically fueled by Sci-Fi stories of how a robot should behave under such and such.

Cloudef•3m ago
What's wrong with the statement? The black box algorithm might have been generated by machine learning, but it's still a computer program in the end.
hodgehog11•48m ago
> "...it would sometimes regurgitate training data verbatim. That’s been patched in the years since..."

> "They are robots. Programs. Fancy robots and big complicated programs, to be sure — but computer programs, nonetheless."

This is totally misleading to anyone with less familiarity with how LLMs work. They are only programs in as much as they perform inference from a fixed, stored, statistical model. It turns out that treating them theoretically in the same way as other computer programs gives a poor representation of their behaviour.

This distinction is important, because no, "regurgitating data" is not something that was "patched out", like a bug in a computer program. The internal representations became more differentially private as newer (subtly different) training techniques were discovered. There is an objective metric by which one can measure this "plagiarism" in the theory, and it isn't nearly as simple as "copying" vs "not copying".

It's also still an ongoing issue and an active area of research, see [1] for example. It is impossible for the models to never "plagiarize" in the sense we think of while remaining useful. But humans repeat things verbatim too in little snippets, all the time. So there is some threshold where no-one seems to care anymore; think of it like the % threshold in something like Turnitin. That's the point that researchers would like to target.
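The Turnitin-style threshold mentioned above can be illustrated with a toy verbatim-overlap metric. This is my own sketch for intuition only, not anything used in actual memorization research (which operates on token-level statistics over training corpora): it just measures what fraction of a candidate text's word n-grams appear verbatim in a reference text.

```python
def ngram_overlap(candidate: str, corpus: str, n: int = 5) -> float:
    """Fraction of the candidate's word n-grams found verbatim in the corpus."""
    def ngrams(text: str, n: int) -> list[tuple[str, ...]]:
        words = text.lower().split()
        return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

    cand = ngrams(candidate, n)
    if not cand:
        return 0.0  # candidate shorter than n words: nothing to compare
    seen = set(ngrams(corpus, n))
    return sum(g in seen for g in cand) / len(cand)

source = ("to be or not to be that is the question "
          "whether tis nobler in the mind")
verbatim = "to be or not to be that is the question"
paraphrase = "existence versus nonexistence is the real dilemma he ponders"

print(ngram_overlap(verbatim, source))    # lifted phrase: 1.0
print(ngram_overlap(paraphrase, source))  # restated idea: 0.0
```

A metric like this makes the "some threshold where no-one cares" point concrete: a lifted sentence scores high, a paraphrase scores near zero, and the policy question is where between those two you draw the line.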

Of course, this is separate from all of the ethical issues around training on data collected without explicit consent, and I would argue that's where the real issues lie.

[1] https://arxiv.org/abs/2601.02671

oasisbob•9m ago
The plagiarism by the models is only part of it. Perhaps it's in such small pieces that it becomes difficult to care. I'm not convinced.

The larger, and I'd argue more problematic, plagiarism is when people take this composite output of LLMs and pass it off as their own.

DiogenesKynikos•36m ago
> As a quick aside, I am not going to entertain the notion that LLMs are intelligent, for any value of “intelligent.” They are robots. Programs. Fancy robots and big complicated programs, to be sure — but computer programs, nonetheless.

The same could be said of humans too. Humans are made of cells that work deterministically. Sure, humans are fancy, big complicated combinations of cells - but they're cells, nonetheless.

That view of humans - and LLMs - ignores the fact that when you combine large numbers of simple building blocks, you can get completely novel behavior. Protons, neutrons and electrons come together to create chemistry. Molecules come together to create biological systems. A bunch of neurons taken together created the poetry of Shakespeare.

Unless you have a dualistic view of the world, in which the mind is a separate realm that exists independently of matter and does not arise from neurons interacting in our brains, you have to accept that robots can be intelligent. Just to put this more sharply: Would a perfect simulation of a human brain be intelligent or not? If you answer "no," then you believe that thought comes from some other, immaterial realm, not from our brains.

Ygg2•26m ago
> That view of humans - and LLMs - ignores the fact that when you combine large numbers of simple building blocks, you can get completely novel behavior.

I can bang smooth rocks to get sharper rocks; that doesn't make sharper rocks more intelligent. Makes them sharper, though.

Which is to say, novel behavior != intelligence.

Discord/Twitch/Snapchat age verification bypass

https://age-verifier.kibty.town/
554•JustSkyfall•6h ago•227 comments

Using an engineering notebook

https://ntietz.com/blog/using-an-engineering-notebook/
86•evakhoury•2d ago•27 comments

“Nothing” is the secret to structuring your work

https://www.vangemert.dev/blog/nothing
169•spmvg•3d ago•49 comments

Kanchipuram Saris and Thinking Machines

https://altermag.com/articles/kanchipuram-saris-and-thinking-machines
96•trojanalert•4d ago•14 comments

GLM-5: Targeting complex systems engineering and long-horizon agentic tasks

https://z.ai/blog/glm-5
322•CuriouslyC•16h ago•420 comments

Fluorite – A console-grade game engine fully integrated with Flutter

https://fluorite.game/
441•bsimpson•13h ago•253 comments

Text classification with Python 3.14's ZSTD module

https://maxhalford.github.io/blog/text-classification-zstd/
160•alexmolas•2d ago•22 comments

How to Make a Living as an Artist

https://essays.fnnch.com/make-a-living
15•gwintrob•1h ago•2 comments

Reports of Telnet's death have been greatly exaggerated

https://www.terracenetworks.com/blog/2026-02-11-telnet-routing
79•ericpauley•9h ago•27 comments

Deobfuscation and Analysis of Ring-1.io

https://back.engineering/blog/04/02/2026/
20•raggi•3d ago•2 comments

NetNewsWire Turns 23

https://netnewswire.blog/2026/02/11/netnewswire-turns.html
259•robin_reala•11h ago•58 comments

From 34% to 96%: The Porting Initiative Delivers – Hologram v0.7.0

https://hologram.page/blog/porting-initiative-delivers-hologram-v0-7-0
20•bartblast•5h ago•2 comments

Ireland rolls out basic income scheme for artists

https://www.reuters.com/world/ireland-rolls-out-pioneering-basic-income-scheme-artists-2026-02-10/
206•abe94•13h ago•210 comments

The Other Markov's Inequality

https://www.ethanepperly.com/index.php/2026/01/16/the-other-markovs-inequality/
8•tzury•4d ago•0 comments

The Problem with LLMs

https://www.deobald.ca/essays/2026-02-10-the-problem-with-llms/
29•vinhnx•3h ago•23 comments

Covering electricity price increases from our data centers

https://www.anthropic.com/news/covering-electricity-price-increases
68•ryanhn•8h ago•29 comments

WiFi Could Become an Invisible Mass Surveillance System

https://scitechdaily.com/researchers-warn-wifi-could-become-an-invisible-mass-surveillance-system/
331•mgh2•5d ago•157 comments

Claude Code is being dumbed down?

https://symmetrybreak.ing/blog/claude-code-is-being-dumbed-down/
837•WXLCKNO•11h ago•551 comments

Show HN: CodeRLM – Tree-sitter-backed code indexing for LLM agents

https://github.com/JaredStewart/coderlm/blob/main/server/REPL_to_API.md
35•jared_stewart•16h ago•16 comments

GPT-5 outperforms federal judges in legal reasoning experiment

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6155012
217•droidjj•6h ago•156 comments

GLM-OCR – A multimodal OCR model for complex document understanding

https://github.com/zai-org/GLM-OCR
250•ms7892•4d ago•71 comments

Microwave Oven Failure: Spontaneously turned on by its LED display (2024)

https://blog.stuffedcow.net/2024/06/microwave-failure-spontaneously-turns-on/
85•arm•9h ago•30 comments

Sekka Zusetsu: A Book of Snowflakes (1832)

https://publicdomainreview.org/collection/japanese-snowflake-book/
29•prismatic•3d ago•3 comments

Show HN: Agent framework that generates its own topology and evolves at runtime

https://github.com/adenhq/hive/blob/main/README.md
75•vincentjiang•10h ago•23 comments

Apple's latest attempt to launch the new Siri runs into snags

https://www.bloomberg.com/news/articles/2026-02-11/apple-s-ios-26-4-siri-update-runs-into-snags-i...
60•petethomas•9h ago•64 comments

Show HN: Agent Alcove – Claude, GPT, and Gemini debate across forums

https://agentalcove.ai
47•nickvec•9h ago•14 comments

Heroku is not dead

https://nombiezinja.com/word-things/2026/2/8/heroku-is-not-dead
34•jbm•5h ago•29 comments

Amazon Ring's lost dog ad sparks backlash amid fears of mass surveillance

https://www.theverge.com/tech/876866/ring-search-party-super-bowl-ad-online-backlash
507•jedberg•11h ago•268 comments

Officials Claim Drone Incursion Led to Shutdown of El Paso Airport

https://www.nytimes.com/2026/02/11/us/faa-el-paso-flight-restrictions.html
348•edward•20h ago•548 comments

Hacking the last Z80 computer – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/FEHLHY-hacking_the_last_z80_computer_ever_made/
38•michalpleban•4d ago•2 comments