I don't disagree with your overall assessment, but I'm curious: what's the basis for attributing it specifically to Claude?
Also, if that's actually the case, it's incredibly ironic that we have a person who, presumably, is supposed to possess "intelligence", relying on a language model to help formulate and express their ideas about why models can't be intelligent.
This is the sort of thing that might have made sense in 2005, not 2025.
Is "human intelligence" and "intelligence" equal?
And: How to become conscious before being intelligent?
Or: If intelligence is a side-effect, how often this side-effect can't be observed?
Xor: What if an intelligent being denies being conscious?
It's an intellectually lazy argument to make. Like saying "we will never walk on the moon" in the 1900s.
Imminent AGI/ASI/God-like AI/end of humanity hawks are part of a growing AI cult. The cult leaders are driven by insatiable greed and the gullible cult followers are blinded by hope.
And I say this as a developer who is quite pleased with the progress of coding assistant tools recently.
He is postulating that the way we deal with physical memory (an example of L2 and L3 caches is provided) demonstrates that, as we proceed with trying to build AGI out of our classical computer architectures, there are some fundamental problems; for example, larger caches are slower. With human intelligence, this doesn't seem to be a problem, at least for some humans. If you understand the "attention" part of recent developments in this field, he's saying that transformers are the most efficient way we've found to achieve it, and it's starting to look like a problem from a "physicality" standpoint, as the author puts it.
The "physicality" here is that larger caches are not just computationally larger banks of data but are actually physically larger and, by Euclidean distance, further away. Yet, paradoxically, neither the elephant nor the blue whale has the smartest brain on the planet, and the distance from the center of my head to Broca's region seems to have no effect on my elocution. Etc. Studying Einstein's brain doesn't do much (I guess the insulators are somewhat important?) for understanding Einstein's intelligence ... but physical layout is 100% critical to understanding L2 and L3 caches on a die.
> Who is to say biological computers won't be built?
No one is saying that. We're just pretty sure it's not happening in our lifetimes.
I think you misunderstand the author's intent. He's not saying "The things I can't imagine are not going to happen". He's trying to argue that, "Look, the way things are going, the diminishing returns we are already seeing, the way our hardware works, this isn't going to get us to AGI." Of course, if you had some new architecture or weird "wetware" that somehow solved these problems, I'm sure this article would concede that; but that's not its point.
Hofstadter was on to something with his “strange loops” idea. He never really defined it well enough to operationalize, but I have a hard time faulting him for that because it’s not like anyone else has managed to articulate a less wooly theory. And it’s true that we’ve managed to incorporate plenty of loops into LLMs, but they are still pretty light on strangeness.
For example, something brains do that I suspect is essential to biological intelligence is self-optimization: they can implement processes using inefficient-but-flexible conscious control, and then store information about that process in fast but volatile chemical storage, and then later consolidate that learning and transfer it to physically encoded neural circuits, and even continue to optimize those physical circuits over time if they get signals to indicate that doing so would be worth the space and energy cost.
Comparing that kind of thing to how LLMs work, I come away with the impression that the technology is still pretty primitive and we’re just making up for it with brute force. Kind of the equivalent of getting to an output of 10,000 horsepower by using a team of 25,000 horses.
You usually see this lack of imagination in the crusty old sciences, not in something as fast moving as this field. Props to the guy for being ahead of the curve though.
I'm the first to shit on anyone who thinks current LLMs will take us to AGI, but I'm far from insane enough to claim this is the end of the road.
Sure, that's a sensible stance to take... assuming that there will be no further technological development.
The article is a bit rambling, but the main claims seem to be:
1) Computers can't emulate brains due to architecture (locality, caching, etc) and power consumption
2) GPUs are maxing out in terms of performance (and implicitly AGI has to use GPUs)
3) Scaling is not enough, since due to 2) scaling is close to maxing out
4) AGI won't happen because he defines AGI as requiring robotics, and he sees scaling of robotic experience as a limiting factor
5) Superintelligence (which he associates with self-improving AGI) won't happen because it'll again require more compute
It's a strange set of arguments, most of which don't hold up, and the article both misses what is actually wrong with the current approach and fails to conceive of what different approach will get us to AGI.
1) Brains don't have some exotic architecture that somehow gives them an advantage over computers in terms of locality, etc. The cortex is in fact basically a 2-D structure - a sheet of cortical columns, with a combination of local and long-distance connections.
Where brains differ from a von Neumann architecture is that compute and memory are one and the same, but if we're comparing communication speed between different cortical areas versus between TPU/GPU chips, then the speed advantage goes to the computer.
2) Even if AGI had to use matmul and systolic arrays, and GPUs are maxing out in terms of FLOPs, we could still scale compute, if needed, just by having more GPUs and faster and/or wider interconnect.
3) As above, it seems we can scale compute just by adding more GPUs and faster interconnect if needed, but in any case I don't think inability to scale is why AGI isn't about to emerge from LLMs.
4) Robotics and AGI are two separate things. A person lying in a hospital bed still has a brain and human-level general intelligence. Robots will eventually learn individually on the job, just as non-embodied AGI instances will, so the size of pre-training datasets/experience will become irrelevant.
5) You need to define intelligence before supposing what super-human intelligence is and how it may come about, but Dettmers just talks about superintelligence in a hand-wavy fashion as something that AGI may design, and assumes that whatever it is will require more compute than AGI. In reality, intelligence is prediction, limited in domain by your predictive inputs and in quality/degree by the sophistication of your predictive algorithms, neither of which necessarily needs more compute.
What is REALLY wrong with the GPT LLM approach, and why it can't just be scaled to achieve AGI, is that it is missing key architectural and algorithmic components (such as incremental learning, and a half dozen others), and perhaps more fundamentally that auto-regressive self-prediction is just the wrong approach. AGI needs to learn to act and predict the consequences of its own actions - it needs to predict external inputs, not its own generated sequences.
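To be concrete about the kind of thing I mean by incremental learning plus prediction, here's a toy sketch (my own illustration, nothing like what a real AGI would use): a predictor that updates its model after every single observation rather than in one big offline training pass.

    from collections import Counter, defaultdict

    class OnlinePredictor:
        # Toy count-based predictor: learns P(next | current) from a stream,
        # updating its counts after every observation (incremental learning).
        def __init__(self):
            self.counts = defaultdict(Counter)  # counts[a][b] = times b followed a
            self.prev = None

        def observe(self, symbol):
            if self.prev is not None:
                self.counts[self.prev][symbol] += 1
            self.prev = symbol

        def predict_next(self):
            # Most likely next symbol given the current one, or None if unseen.
            if self.prev is None or not self.counts[self.prev]:
                return None
            return self.counts[self.prev].most_common(1)[0][0]

    p = OnlinePredictor()
    for word in "the cat sat on the mat the cat sat".split():
        p.observe(word)
    print(p.predict_next())  # prints "on": the only word it has seen follow "sat"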
Whether the AGI that they announce is actually AGI or not is a completely different debate. The goalposts will just continue to be moved until the statement is true.
Much better, in my opinion, is Hans Moravec's 1997 paper "When will computer hardware match the human brain?", which seems quite solid in its reasoning. He was a roboticist who spent his time trying to make robots do things like vision, and since the retina was quite well understood at the time, he could compare the amount of compute needed to do something equivalent to a given volume of neurons and then multiply up by the size of the brain. His conclusions were:
> human brain is about 100,000 times as large as the retina, suggesting that matching overall human behavior will take about 100 million MIPS of computer power
100 TFLOPS roughly. Or similar to a top graphics card now. Also:
> Based on extrapolation of past trends and on examination of technologies under development, it is predicted that the required hardware will be available in cheap machines in the 2020s.
which seems to have come to pass. I think we don't have AGI yet because the LLM algorithms are very inefficient and not right for the job - it was more a language translation algorithm that surprised people by getting quite smart if you threw huge compute and the whole internet at it.
(Moravec's paper, which is not a bad read: https://jetpress.org/volume1/moravec.pdf)
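To spell out the arithmetic behind those quotes (using only the figures quoted above, and treating one instruction as roughly one floating-point operation, which is hand-wavy but fine for an order-of-magnitude estimate):

    brain_mips = 100e6                    # "about 100 million MIPS" for the whole brain
    retina_mips = brain_mips / 100_000    # implied retina equivalent: ~1,000 MIPS
    ops_per_sec = brain_mips * 1e6        # 1 MIPS = 10^6 instructions per second
    print(f"retina ~{retina_mips:.0f} MIPS, brain ~{ops_per_sec:.0e} ops/sec")
    # -> ~1e14 ops/sec, i.e. ~100 tera-ops, which is where the "100 TFLOPS,
    #    similar to a top graphics card" figure comes from.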
It also has
>This paper describes how the performance of AI machines tends to improve at the same pace that AI researchers get access to faster hardware.
My guess is that will come to pass. AI researchers have the hardware and the software will improve shortly.
This has always been the issue. This is an argument I made more than 20 years ago. AGI, whatever it is as a technical problem, is mainly a TESTING problem. If you don't solve that, then AGI remains a matter of faith. A cult.
You and I might agree that it's not AGI, but that's not going to stop Sam Altman from using such a bogus claim to pump share prices right before an IPO.
One analogy for this is cars. We found out it's enough to pave a network of roads for "good enough" cars, even though there are still pipe dreams about a supercar so flexible it can navigate any terrain.
The issue might be that the West is delaying building the road infrastructure waiting for that AGI supercar to happen.
(Military might care about the latter though, but in reality, drones and quadrupeds and tanks will be a better choice.)
jqpabc123•20h ago
No amount of fantastical thinking is going to coax AGI out of a box of inanimate binary switches --- aka, a computer as we know it.
Even with billions and billions of microscopic switches operating at extremely high speed consuming an enormous share of the world's energy, a computer will still be nothing more than a binary logic playback device.
Expecting anything more is to defy logic and physics and just assume that "intelligence" is a binary algorithm.
jqpabc123•19h ago
What do you believe?
antonvs•17h ago
We already have irrefutable evidence of what can reasonably be called intelligence, from a functional perspective, from these models. In fact, in many respects the models outperform a majority of humans on many kinds of tasks requiring intelligence. Coding-related tasks are an especially good example.
Of course, they're not equivalent to humans in all respects, but there's no reason that should be a requirement for intelligence.
If anything, the onus lies on you to clarify what you think can't be achieved by these models, in principle.
soulofmischief•16h ago
Intelligence can be expressed in higher-order terms than the logic of the binary gates running the underlying software, and those gates aren't required to account for it.
Quarks don't need to account for atomic physics. Atomic physics doesn't need to account for chemistry. Chemistry doesn't need to account for materials science. It goes on and on. It's easy to look at a soup of quarks and go, "there's no way this soup of quarks could support my definition of intelligence!", but you go up the chain of abstraction and suddenly you've got a brain.
Scientists don't even understand yet where subjective consciousness comes into the picture. There are so many unanswered questions that it's preposterous to claim you know the answers without proof that extends beyond a handwavy belief.
soulofmischief•20h ago
What logic and physics are being defied by the assumption that intelligence doesn't require the specific biological machinery we are accustomed to?
This is a ridiculous comment to make; you do nothing to actually support the claims you're making, which are even stronger than the claims most people will make about the potential of AGI.
echelon•20h ago
Human intelligence seems likely to be a few tricks we just haven't figured out yet. Once we figure it out, we'll probably remark on how simple a model it is.
We don't have the necessary foundation to get there yet. (Background context, software/hardware ecosystem, understanding, clues from other domains, enough people spending time on it, etc.) But one day we will.
At some point people will try to run human-level AGI intelligences on their Raspberry Pi. I'd almost bet that will be a game played in the future - run human-level AGI intelligences on as low a spec machine as possible.
I also wonder what it would be like if the AGI/ASI timeline coincided with our ability to do human brain scans at higher fidelity. And, if they do line up, we might try replicating our actual human thoughts and dreams on our future architectures as we make progress on AGI.
If those timelines have anything to do with one another, then when we crack AGI, we might also be close to "human brain uploads". I wouldn't say it's a necessary precondition, but I'd bet it would help if the timelines aligned.
And I know the limits of detection right now and in the foreseeable future are abysmal. So AGI and even ASI probably come first. But it'd be neat if they were close to parallel.
jqpabc123•20h ago
The logic and physics that make a computer what it is --- a binary logic playback device.
By design, this is all it is capable of doing.
Assuming a finite, inanimate computer can produce AGI is to assume that "intelligence" is nothing more than a binary logic algorithm. Currently, there is no logical basis for this assumption --- simply because we have yet to produce a logical definition of "intelligence".
Of all people, programmers should understand that you can't program something that is not defined.
Ukv•18h ago
Humans are also made up of a finite number of tiny particles moving around that would, on their own, not be considered living or intelligent.
> [...] we have yet to produce a logical definition of "intelligence". Of all people, programmers should understand that you can't program something that is not defined.
There are multiple definitions of intelligence, some mathematically formalized, usually centered around reasoning and adapting to new challenges.
There are also a variety of definitions for what makes an application "accessible", most not super precise, but that doesn't prevent me from improving the application in ways such that it gradually meets more and more people's definitions of accessible.
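One concrete example of such a formalization (quoting from memory, so treat the details as approximate) is Legg and Hutter's "universal intelligence" measure, which scores an agent pi by its expected reward across all computable environments, weighted toward the simpler ones:

    \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}

where E is the set of computable reward-giving environments, K(mu) is the Kolmogorov complexity of environment mu, and V is the agent's expected cumulative reward in it. You can argue with the definition, but it is a precise one.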
soulofmischief•16h ago
What do you mean by "finite"? Are you familiar with the halting problem? [1]
What does "inanimate" mean here? Have you seen a robot before?
Imprecise language negates your entire argument. You need to very precisely express your thoughts if you are to make such bold, fundamental claims.
While it's great that you're taking an interest in this subject, you're clearly speaking from a place of great ignorance, and it would serve you better to learn more about the things you're criticizing before making inflammatory, ill-founded claims. Especially when you start trying to tell a field expert that they don't know their own field.
Using handwavy words you don't seem to understand such as "finite" and "inanimate" while also claiming we don't have a "logical definition" (whatever that means) of intelligence just results in an incomprehensible argument.
[0] https://en.wikipedia.org/wiki/Turing_machine
[1] https://en.wikipedia.org/wiki/Halting_problem
jqpabc123•19h ago
Currently, "intelligence" is lacking a clear mathematical definition.
lostmsu•18h ago
On the contrary, there are many. You just don't like them. E.g. skill of prediction.
Ukv•20h ago
I generally agree with the article's point, though I think "Will Never Happen" is too strong a conclusion. I don't, however, think the idea that simple components ("a box of inanimate binary switches") fundamentally cannot combine to produce complex behaviour is well-founded.
seanw265•20h ago
The article is about the constraints of computation, scaling of current inference architecture, and economics.
It is completely unrelated to your claim that cognition is entirely separate from computation.
AnotherGoodName•20h ago
What if I had an external source of true randomness? Very easy to add. In fact, current AI algorithms have a temperature parameter whose sampling can easily use true randomness if you want it to.
Would you suddenly change your mind and say, OK, "now it can be AGI!", because I added a nuclear-decay-based random number generator to my AI model?
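For what it's worth, here's a rough sketch of what I mean (my own toy example; Python's secrets module just stands in for a hardware entropy source, and a radioactive-decay RNG would plug in at exactly the same spot):

    import math
    import secrets

    def sample_with_temperature(logits, temperature=1.0):
        # Softmax over temperature-scaled logits.
        scaled = [x / temperature for x in logits]
        m = max(scaled)
        exps = [math.exp(x - m) for x in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        # The random draw comes from the OS entropy pool rather than a seeded
        # pseudo-random generator; a true hardware RNG would replace this line.
        r = secrets.randbits(53) / (1 << 53)
        cumulative = 0.0
        for i, p in enumerate(probs):
            cumulative += p
            if r < cumulative:
                return i
        return len(probs) - 1

    print(sample_with_temperature([2.0, 1.0, 0.1], temperature=0.7))

The model's weights and architecture are untouched; only the source of the sampling noise changes.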
eldavojohn•20h ago
"A Novel Bridge from Randomness in Stochastic Data to, Like, OMG I'm SO Randomness in Valley Girl Entropy"
We will pay dearly for overloading that word. Good AGI will be capable of saying the most random things! Not really, no. I mean, they'll still be pronounceable, I'm guessing?