- consumer marketing
- politics
- venture fundraising
When any system has a few power law winners, it makes sense to grab attention.
Look at Trump and Musk and now Altman. They figured it out.
MrBeast...
Attention, even if negative, wedges you into the system and everyone's awareness. Your mousey quiet competitors aren't even seen or acknowledged. The attention grabbers suck all the oxygen out of the room and win.
If you go back and look at any victory, was it really better solutions, or was it the fact that better solutions led to more attention?
"Look here" -> build consensus and ignore naysayers -> keep building -> feedback loop -> win
It might not just be a societal algorithm. It might be one of the universe's fundamental greedy optimization algorithms. It might underpin lots of systems, including how we ourselves as individuals think and learn.
Our pain receptors. Our own intellectual interests and hobbies. Children learning on the playground. Ant colonies. Bee swarms. The world is full of signals, and there are mechanisms which focus us on the right stimuli.
Maybe read my comment at face value. I do have a point tangential to the discussion at hand.
LLMs trained on me (and the Hacker News corpus), not the other way around.
If you could just spam and annoy until you win, we'd all be dancing to remixed versions of the Macarena.
> (...) We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
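For anyone who hasn't looked past the pun: a minimal NumPy sketch of the scaled dot-product attention that sentence refers to, with multi-head attention, masking, and everything else stripped away.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy sketch: softmax(Q K^T / sqrt(d_k)) V.
    Shapes: (seq, d_k) for Q and K, (seq, d_v) for V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 3 tokens, 4-dimensional keys/values.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```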
The idea that LLMs are just trained on a pile of raw Internet is severely outdated. (Not sure it was ever fully true, but it's far from that by now.)
Code is one of the easier datasets to curate, because we have a number of ways to actually (somewhat) assess its quality. (Does it work? Does it come with a set of tests and pass them? Does it have stylistic integrity? How many issues get flagged by various analysis tools? Etc., etc.)
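To make that concrete, here's a minimal sketch of what such curation checks could look like for Python files. The tools (flake8, pytest), the test_<name>.py convention, and the thresholds are my assumptions, not anything a real pipeline necessarily uses:

```python
import ast
import subprocess
from pathlib import Path

def quality_signals(path: str) -> dict:
    """Collect rough quality signals for one Python file in a candidate corpus."""
    source = Path(path).read_text(encoding="utf-8", errors="ignore")
    signals = {}

    # "Does it work?" -- at minimum, does it parse?
    try:
        ast.parse(source)
        signals["parses"] = True
    except SyntaxError:
        signals["parses"] = False

    # "How many issues get flagged?" -- flake8 prints one line per issue.
    lint = subprocess.run(["flake8", path], capture_output=True, text=True)
    signals["lint_issues"] = len(lint.stdout.splitlines())

    # "Does it come with tests and pass them?" -- assumes a test_<name>.py convention.
    test_file = Path(path).with_name("test_" + Path(path).name)
    if test_file.exists():
        result = subprocess.run(["pytest", "-q", str(test_file)], capture_output=True)
        signals["tests_pass"] = result.returncode == 0
    else:
        signals["tests_pass"] = None  # no tests found

    return signals

def keep_for_pretraining(signals: dict) -> bool:
    # Thresholds are invented for illustration.
    return signals["parses"] and signals["lint_issues"] < 10 and signals["tests_pass"] is not False
```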
>"“Brain Rot” for LLMs isn’t just a catchy metaphor—it reframes data curation as cognitive hygiene for AI"
A metaphor is exactly what it is: not only do LLMs not possess human cognition, but there's certainly no established science under which they're literally valid subjects for clinical psychological assessment.
How does this stuff get published? This is basically a blog post. One of the worst aspects of the whole AI craze is that it has turned a non-trivial amount of academia into a complete cargo-cult joke.
I think it's intended as a catchy warning, to people who are dumping every piece of the internet (and synthetic data based on it!) into training, that there are repercussions.
Letting researchers pollute it with blog-gunk is an abuse of the referral/vetting system for submitters.
The two bits about this paper that I think are worth calling out specifically:
- A reasonable amount of post-training can't save you when your pretraining comes from a bad pipeline; i.e., even if the syntactics of the pretraining data are legitimate, the model can still learn bad implicit behavior (thought skipping)
- Trying to classify "bad data" is itself a nontrivial problem. Here the heuristic approach of filtering on engagement actually proved more reliable than an LLM classification of the content (a toy sketch of both approaches follows below)
TLDR: If your data set is junk, your trained model/weights will probably be junk too.
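To make the second point concrete, here's a toy sketch of the two filtering strategies; the field names and thresholds are invented for illustration and are not the paper's:

```python
from typing import Callable

def junk_by_engagement(doc: dict, like_threshold: int = 500, max_words: int = 30) -> bool:
    """Heuristic proxy: very short, very popular posts get flagged as junk."""
    return doc["likes"] > like_threshold and len(doc["text"].split()) < max_words

def junk_by_llm(doc: dict, judge: Callable[[str], float], cutoff: float = 0.5) -> bool:
    """LLM-as-judge alternative: `judge` is any callable returning a quality
    score in [0, 1]; in practice this would wrap a model call."""
    return judge(doc["text"]) < cutoff

corpus = [
    {"text": "u won't BELIEVE this one trick!!!", "likes": 12000},
    {"text": "A longer post walking through why attention cost scales quadratically with sequence length.", "likes": 40},
]
pretrain_set = [d for d in corpus if not junk_by_engagement(d)]
print(len(pretrain_set))  # 1: only the longer, low-engagement post survives
```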
[0] https://books.google.se/books?id=KOUCAAAAMBAJ&pg=PA48&vq=ses...
There were psychologists who talked about the zone of proximal development[0], about the importance of exposing a learner to tasks they cannot do without support. But I can't remember anything about going further and exposing a learner to tasks far above their head, where they cannot understand a word.
There is a legend about Sofya Kovalevskaya[1], who became a noteworthy mathematician after she was exposed to lecture notes by Ostrogradsky when she was 11. The walls of her room were papered with those notes, and she was curious what all those symbols were. It doesn't mean that there is a causal link between these two events, but what if there is one?
What about watching a deep, analytical TV show at 9? How does it affect brain development? I think no one has tried to research that. My gut feeling is that it can be motivational: I didn't understand computers when I first met them, but I was really intrigued by them. I learned BASIC and it was like magic incantations. It built a strong motivation to study CS deeper. But the question is: are there any other effects beyond motivation? I remember looking at a C program in some book and wondering what it all meant. I could understand nothing, but I still spent some time trying to decipher the program. Probably I had other experiences like that which I don't remember now. Can we say with certainty that they had no influence on my development and didn't make things easier for me later?
> So maybe we should check in on the boomers too if we're sincere about these worries.
Probably we should be sincere.
[0] https://en.wikipedia.org/wiki/Zone_of_proximal_development
An LLM-written line if I’ve ever seen one. Looks like the authors have their own brainrot to contend with.
The issue is how tools are used, not that they are used at all.
Whether it’s a tsunami and whether most people will do it has no relevance to my expectation that researchers of LLMs and brainrot shouldn’t outsource their own thinking and creativity to an LLM in a paper that itself implies that using LLMs causes brainrot.
Seems like none to me.
The problem isn’t using AI—it’s sounding like AI trying to impress a marketing department. That’s when you know the loop’s closed.
It doesn’t help that, used for writing, it stultifies and gives everything the same boring, cheery yet slightly confused tone of voice.
Are you describing LLMs or social media users?
Don't conflate how the content was created with its quality. The "You must be at least this smart (tall) to publish (ride)" sign got torn down years ago. Speakers' Corner is now an (inter)national stage, and it's written, so it must be true...
Basically, I think the brain-rot framing might be a bit of a terminological distraction here, when what they seem to be measuring is whether something is a puff piece or dense.
[0]: https://www.forbes.com/sites/traversmark/2024/05/17/why-kids...
Brainrot created by LLMs is important to worry about, given their design as "people pleasers".
Their anthropomorphization can be scary too, no doubt.
https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-38/segm...
I spotted a large number of things in there that it would be unwise to repeat here. But I assume the data cleaning process removes such content before pretraining? ;)
Although I have to wonder. I played with some of the base/text Llama models, and got very disturbing output from them. So there's not that much cleaning going on.
I didn't check what you're referring to, but yes, the major providers likely have state-of-the-art classifiers for censoring and filtering such content.
And when that doesn't work, they can use RLHF to keep the behavior from occurring.
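Roughly, that filtering stage amounts to something like the sketch below; the scorer here is a stub standing in for whatever trained classifier a provider actually runs over crawl-scale data, and the threshold is illustrative:

```python
from typing import Callable, Iterable, Iterator

def filter_pretraining_docs(
    docs: Iterable[str],
    unsafe_score: Callable[[str], float],
    threshold: float = 0.8,
) -> Iterator[str]:
    """Drop documents a safety classifier scores at or above `threshold`."""
    for doc in docs:
        if unsafe_score(doc) < threshold:
            yield doc

# Stub scorer so the sketch runs end to end; a real pipeline would call a trained model.
def stub_scorer(text: str) -> float:
    blocklist = ("credit card dump", "how to synthesize")
    return 1.0 if any(term in text.lower() for term in blocklist) else 0.0

sample = ["A recipe for sourdough bread.", "fresh credit card dump, DM me"]
print(list(filter_pretraining_docs(sample, stub_scorer)))  # only the bread survives
```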
You're trying to make some claim about garbage in/garbage out, but if there's even a tiny moat, it's in the filtering of these datasets and the purchasing of licenses for other, larger sources of data that (unlike Common Crawl) _aren't_ freely available for competitors and the open-source movement to use.
Is this slop?
and you know what they say, if it walks like slop, quacks like slop and talks like slop, it's probably slop
If you look at two random patterns of characters and both contain 6s, you could say they are similar (ignoring that the similarity is less than 0.01%). That's what comparing LLMs to brains feels like. Like comparing roller skates to a cruise ship: they both let you get around.
"Cool" and "for real" are no different than "rizz" and "no cap". You spoke "brain rot" once, and "cringed" when your parents didn't understand. The cycle repeats.
Brain rot in this context is not a reference to slang.
xpe•46m ago
Right: in the context of supervised learning, this statement is a good starting point. After all, how can you build a good supervised model if you can't train it on good examples?
But even in that context, it isn't an incisive framing of the problem. Lots of supervised models are resilient to some kinds of error. A better question, I think, is: what kinds of errors at what prevalence tend to degrade performance and why?
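For instance, one could start probing that question with a toy experiment along these lines (synthetic data and logistic regression chosen arbitrarily, just to show degradation as a function of noise rate):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Flip an increasing fraction of training labels and watch held-out accuracy degrade.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
for noise in (0.0, 0.1, 0.2, 0.4):
    y_noisy = y_tr.copy()
    flip = rng.random(len(y_noisy)) < noise
    y_noisy[flip] = 1 - y_noisy[flip]
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy).score(X_te, y_te)
    print(f"label noise {noise:.0%}: test accuracy {acc:.3f}")
```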
Speaking of LLMs and how they ingest and process data, there is a lot more going on than purely supervised learning, so it seems reasonable to me that researchers would want to try to tease the problem apart.
rriley•3h ago
TL;DR from https://unrav.io/#view/8f20da5f8205c54b5802c2b623702569