Seems like most of the people one would encounter out in the world might not possess AGI. How are we supposed to train our electrified rocks to have AGI if that's the case?
If no one has created an online quiz called "Are you smarter than AGI?" based on the proposed "ten core cognitive domains" yet, I'd be disappointed.
It makes me think of every single public discussion that's ever been had about quantum, where you can't start the conversation unless you go through a quick 101 on what a qubit is.
As with any technology, there's not really a destination; there is only the process of improvement. The only real definitive point is when a technology becomes obsolete, though even then it is often kept alive through nostalgia.
AI will continue to improve. More workflows will become automated. And from our perspective, no matter how rapid the advancement is, we're still frogs in water.
It's a very emotional topic because people feel their self-image is threatened. It's a topic related to what it means to be human. Sure, it should be a separate question, but emotionally it is connected at a deep level. The prospect of job replacement and social transformation is quite a threatening one.
So I'm somewhat understanding of this. It's not merely an academic topic, because these things will be adopted in the real world among real people. So you can't simply silence everyone who is an outsider, or who just heard about this stuff incidentally in the news and has only superficial points to make.
Just like an airplane doesn't work exactly like a bird, but both can fly.
[1] https://andreinfante.substack.com/p/when-will-ai-transform-t...
Assume the Riemann hypothesis is false. Then, consider the proposition "{a|a∉a}∈{a|a∉a}". By the law of the excluded middle, it suffices to consider each case separately. Assuming {a|a∉a}∈{a|a∉a}, we find {a|a∉a}∉{a|a∉a}, for a contradiction. Instead, assuming {a|a∉a}∉{a|a∉a}, we find {a|a∉a}∈{a|a∉a}, for a contradiction. Therefore, "the Riemann hypothesis is false" is false. By the law of the excluded middle, we have shown the Riemann hypothesis is true.
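(For anyone who wants the trick spelled out: the contradiction is derived without ever using the assumption, so the same move "proves" anything. A rough sketch of the structure in Lean 4, where the hypothetical `russell : False` stands in for the paradox, since naive set comprehension isn't expressible there:)

```lean
-- Once a contradiction is available unconditionally, any proposition follows
-- (ex falso quodlibet); the "proof" never actually touches the Riemann hypothesis.
theorem riemann_hypothesis (RH : Prop) (russell : False) : RH :=
  False.elim russell
```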
Naïve AGI is an apt analogy in this regard, but I feel these systems are neither simple nor elegant enough to deserve the name naïve.
> defining AGI as matching the cognitive versatility and proficiency of a well-educated adult.

Is it about jobs/tasks, or cognitive capabilities? The majority of the AI valley seems to focus on the former; TFA focuses on the latter.
Can it do tasks, or jobs? Jobs are bundles of tasks. AI might be able to do 90% of tasks for a given job, but not the whole job.
If tasks, what counts as a task: Is it only specific things with clear success criteria? That's easier.
Is scaffolding allowed: Does it need to be able to do the tasks/jobs without scaffolding and human-written few-shot prompts?
Today's tasks/jobs only, or does it include future ones too? As tasks and jobs get automated, jobs evolve and get re-defined. So, being able to do the future jobs too is much harder.
Remote only, or in-person too: In-person too is a much higher bar.
What threshold of tasks/jobs: "most" is apparently typically understood to mean 80-95% (Mira Ariel). Automating 80% of tasks is different from automating 90%, 95%, or 99%; there are diminishing returns. And how are the tasks counted - by frequency, weighted by dollar value, or by unique count of tasks? (See the sketch after this list.)
Only economically valuable tasks/jobs, or does it include anything a human can do?
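To make the counting question concrete, here's a toy sketch with made-up task names and numbers (purely illustrative): the same set of automated tasks can look like 50%, ~97%, or ~17% coverage depending on whether you count unique tasks, occurrences, or dollars.

```python
tasks = [
    # (name, times performed per month, dollars of value per occurrence, automatable?)
    ("triage tickets",       400,    5, True),
    ("write status report",   20,   50, True),
    ("debug incident",        10,  500, False),
    ("negotiate contract",     2, 5000, False),
]

by_count     = sum(a for *_, a in tasks) / len(tasks)
by_frequency = sum(n for _, n, _, a in tasks if a) / sum(n for _, n, _, _ in tasks)
by_dollars   = sum(n * d for _, n, d, a in tasks if a) / sum(n * d for _, n, d, _ in tasks)

print(f"unique tasks automated: {by_count:.0%}")      # 50%
print(f"frequency-weighted:     {by_frequency:.0%}")  # ~97%
print(f"dollar-weighted:        {by_dollars:.0%}")    # ~17%
```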
A high-order bit on many people's AGI timelines is which definition of AGI they're using, so clarifying the definition is nice.
If it does an hour of tasks, but creates an additional hour of work for the worker...
Edit due to rate-limiting, which in turn appears to be due to the inexplicable downvoting of my question: since you (JumpCrisscross) are imputing a human-like motivation to the model, it sounds like you're on the side of those who argue that AGI has already been achieved?
Lying != fallibility.
Edit: toned down the preachiness.
But maybe that's ASI. Whereas I consider ChatGPT 3 to be "baby AGI". That's why it became so popular so fast.
ChatGPT became popular because it was easy to use and amusing. (LLM UX until then had been crappy.)
Not sure AGI aspirations had anything to do with uptake.
AGI was already here the day ChatGPT released: That's Peter Norvig's take too: https://www.noemamag.com/artificial-general-intelligence-is-...
To some people this is self-evident so the terms are equivalent, but it does require some extra assumptions: that the AI would spend time developing AI, that human intelligence isn't already the maximum reachable limit, and that the AGI really is an AGI capable of novel research beyond parroting from its training set.
I think those assumptions are pretty easy to grant, but to some people they're obviously true and to others they're obviously false. So depending on your views on those, AGI and ASI will or will not mean the same thing.
What I find cool about the paper is that they have gathered folks from lots of places (Berkeley, Stanford, MIT, etc.), and no big-4 labs. That's good, IMO.
tl;dr: Their definition: "AGI is an AI that can match or exceed the cognitive versatility and proficiency of a well-educated adult."
Cool. It's a definition. I doubt it will be agreed on by everyone, and I can see endless debates about just about every word in that definition. That's not gonna change. At least it's a starting point.
What I find interesting is that they specifically say it's not a benchmark, or a test set. It's a framework where they detail what should be tested, and how (with examples). They do have a "catchy" table with GPT-4 vs GPT-5, which I bet will be covered by every mainstream outlet/blog/forum out there -> GPT-5 is at ~50% AGI. Big title. You won't believe where it was one year ago. Number 7 will shock you. And all that jazz.
Anyway, I don't think people will stop debating about AGI. And I doubt this methodology will be agreed on by everyone. At the end of the day, both extremes are more ideological in nature than pragmatic. Both ends want/need their view to be correct.
I enjoyed reading it. I don't think it will settle anything. And, as someone posted below, when the first model hits 100% on their framework, we'll find new frameworks to debate about, just like we did with the Turing test :)
Also, weird to see Gary Marcus and Yoshua Bengio on the same paper. Who really wrote this? Author lists are so performative now.
This paper promises to fix "the lack of a concrete definition for Artificial General Intelligence", yet it still relies on the vague notion of a "well-educated adult". That’s especially peculiar, since in many fields AI is already beyond the level of an adult.
You might say this is about "jaggedness", because AI clearly lacks quite a few skills:
> Application of this framework reveals a highly “jagged” cognitive profile in contemporary models.
But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
So, if that’s the case, this isn’t really a framework for AGI; it’s a framework for measuring AI along a particular set of dimensions. A more honest title might be: "A Framework for Measuring the Jaggedness of AI Against the Cattell–Horn–Carroll Theory". It wouldn't be nearly as sexy, though.
On the other hand, research on "common intelligence" AFAIK shows that most measures of different types of intelligence have a very high correlation and some (apologies, I don't know the literature) have posited that we should think about some "general common intelligence" to understand this.
The surprising thing about AI so far is how much more jagged it is with respect to human intelligence.
If you go beyond human species (and well, computers are not even living organisms), it gets tricky. Adaptability (which is arguably a broader concept than intelligence) is very different for, say octopodes, corvids and slime molds.
It is certainly not a single line of proficiency or progress. Things look like lines only if we zoom a lot.
Current AI is in its infancy and we're just throwing data at it in the same way evolution throws random change at our DNA and sees what sticks.
I think people get really uncomfortable trying to even tackle that, and realistically, for a huge set of AI tasks, we need AI that is more intelligent than a huge subset of humans for it to be useful. But there are also a lot of tasks where that is not needed, and we "just" need "more human failure modes".
I do agree that it’s a weird standard though. Many of our AI implementations exceed the level of knowledge of a well-educated adult (and still underperform with that advantage in many contexts).
Personally, I don’t think defining AGI is particularly useful. It is just a marketing term. Rather, it’s more useful to just speak about features/capabilities. Shorthand for a specific set of capabilities will arise naturally.
It's a similar debate with self driving cars. They already drive better than most people in most situations (some humans crash and can't drive in the snow either for example).
Ultimately, defining AGI seems like a fool's errand. At some point the AI will be good enough to do the tasks that some humans do (it already is!). That's all that really matters here.
What matters to me is whether the "AGI" can reliably solve the tasks that I give it, and that also requires reliable learning.
LLMs are far from that. It takes special human AGI to train them to make progress.
How many humans do you know that can do that?
[1]: The capability to continually learn new information (associative, meaningful, and verbatim). (from the publication)
Cattell-Horn-Carroll theory, like a lot of psychometric research, is based on collecting a lot of data and running factor analysis (or similar) to look for axes that seem orthogonal.
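Roughly, that kind of analysis looks like the following (a minimal sketch on synthetic data, assuming scikit-learn; not the actual CHC methodology): scores on many tests are generated by a few latent factors, and factor analysis recovers axes that explain the shared variance.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_people, n_tests, n_factors = 1000, 12, 3

latent = rng.normal(size=(n_people, n_factors))    # unobserved abilities
loadings = rng.normal(size=(n_factors, n_tests))   # how strongly each test taps them
scores = latent @ loadings + 0.5 * rng.normal(size=(n_people, n_tests))

fa = FactorAnalysis(n_components=n_factors, random_state=0)
fa.fit(scores)

# The recovered loading matrix plays the role of the "broad abilities":
# axes explaining variance shared across many observed test scores.
print(fa.components_.shape)   # (3, 12)
```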
It's not clear that the axes are necessary or sufficient to define intelligence, especially if the goal is to define intelligence that applies to non-humans.
For example reading and writing ability and visual processing imply the organism has light sensors, which it may not. Do all intelligent beings have vision? I don't see an obvious reason why they would.
Whatever definition you use for AGI probably shouldn't depend heavily on having analyzed human-specific data for the same reason that your definition of what counts as music shouldn't depend entirely on inferences from a single genre.
Things like chess-playing skill of a machine could be bench-marked against that of a human, but the abstract feelings that drive reasoning and correlations inside a human mind are more biological than logical.
Interestingly, the people doing the actual envelope-pushing in this domain, such as Ilya Sutskever, think that it's a scaling problem and that neural nets do result in AGI eventually, but I haven't heard them substantiate it.
(I'm asking because of your statement, "Don’t fool yourself into believing artificial intelligence is not one breakthrough away", which I'm not sure I understand, but if I am parsing it correctly, I question your basis for saying it.)
“one breakthrough away” as in some breakthrough away
I find anyone with confident answers to questions like these immediately suspect.
There is reason to believe that consciousness, sentience, or emotions require a biological base.
Or
There is no reason to believe that consciousness, sentience, or emotions do not require a biological base.
The first is simple: if there is a reason, you can ask for it and evaluate its merits. Quantum stuff is often pointed to here, but the reasoning is unconvincing.
The second takes the form "There is no reason to believe P does not require Q."
There are no proven reasons, but there are suspected reasons. For instance, if the operation that neurons perform is what makes consciousness work, and that operation can be reproduced non-biologically, it would follow that non-biological consciousness would be possible.
For any observable phenomenon in the brain the same thing can be asked. So far it seems reasonable to expect most of the observable processes could be replicated.
None of it acts as proof, but they probably rise to the bar of reasons.
My emotions are definitely a function of the chemical soup my brain is sitting in (or the opposite).
Let me pose back to you a related question as my answer: How do you know that I feel emotions rather than merely emulating emotional behavior?
This gets into the philosophy of knowing anything at all. Descartes would say that you can't. So we acknowledge the limitation and do our best to build functional models that help us do things other than wallow in existential loneliness.
But you can propose explanations and try to falsify them. I haven’t thought about it but maybe there is a way to construct an experiment to falsify the claim that you don’t feel emotions.
Preface:
The problem of the relation between our bodies and our minds, and especially of the link between brain structures and processes on the one hand and mental dispositions and events on the other is an exceedingly difficult one. Without pretending to be able to foresee future developments, both authors of this book think it improbable that the problem will ever be solved, in the sense that we shall really understand this relation. We think that no more can be expected than to make a little progress here or there.
... well. Thanks a bunch, Karl.
Also, you don't know what species I am. Maybe I'm a dog. :-)
(https://en.wikipedia.org/wiki/On_the_Internet,_nobody_knows_...)
1) I know that I have emotions because I experience them.
2) I know that you and I are very similar because we are both human.
3) I know that we can observe changes in the brain as a result of our changing emotions and that changes to our brains can affect our emotions.
I thus have good reason to believe that since I experience emotions and that we are both human, you experience emotions too.
The alternative explanation, that you are otherwise human and display all the hallmarks of having emotions but do not in fact experience anything (the P-zombie hypothesis), is an extraordinary claim that has no evidence to support it and not even a plausible, hypothetical mechanism of action.
With an emotional machine, I see no immediately obvious, even hypothetical, evidence to lend support to its veracity. In light of all this, it seems extraordinary to claim that non-biological means of achieving real emotions (not emulated emotions) are possible.
After all, emulated emotions have already been demonstrated in video games. To call those sufficient would be setting an extremely low bar.
How does a computer with full AGI experience the feeling of butterflies in your stomach when your first love is requited?
How does a computer experience the tightening of your chest when you have a panic attack?
How does a computer experience the effects of chemicals like adrenaline or dopamine?
The A in AGI stands for "artificial" for good reason, IMO. A computer system can understand these concepts by description, or recognize some of them by computer vision, audio, or other sensors, but it seems as though it will always lack sufficient biological context to experience true consciousness.
Perhaps humans are just biological computers, but the “biological” part could be the most important part of that equation.
There are many parts of human cognition, psychology, etc., especially related to consciousness, that are known unknowns and/or completely unknown.
A mitigation for this issue would be to call it generally applicable intelligence or something like that, rather than human-like intelligence, implying it's not specialized AI but also not human-like. (I don't see why it would need to be human-like, because even with all the right logic and intelligence a human can still do something counter to all of that. Humans do this every day: intuitive action, irrational action, etc.)
What we want is generally applicable intelligence, not human-like intelligence.
We only have one good example of consciousness and sentience, and that is our own. We have good reason to suspect other entities (particularly other human individuals, but also other animals) have it as well, but we cannot access it, nor even confirm its existence. As a result, using these terms for non-human beings becomes confusing at best and will never be actually helpful.
Emotions are another thing: we can define them outside of our own experience, using behavioral states and their connection to patterns of stimuli. On that basis we can certainly observe and describe the behavior of a non-biological entity as emotional. But given that emotion is a regulator of behavior that has evolved over millions of years, whether such a description would be useful is a whole other matter. I would be inclined to use a more general description of behavior patterns, one which includes emotion but also other behavior regulators.
This is a bad definition, because a human baby is already an AGI when it's born and its brain is empty. AGI is the blank slate and the ability to learn anything.
We are born with inherited "data" - innate behaviors, basic pattern recognition, etc. Some even claim that we're born with basic physics toolkit (things are generally solid, they move). We then build on that by being imitators, amassing new skills and methods simply by observation and performing search.
I don't think people really realize what an extraordinary accomplishment it would be to have an artificial system matching the cognitive versatility and proficiency of an uneducated child, much less a well-educated adult. Hell, AI matching the intelligence of some nonhuman animals would be an epoch-defining accomplishment.
Edit: Probably not, since it was published less than a week ago :-) I’ll be watching for benchmarks.
Try to reconcile that with your ideas (that I think are correct for that matter)
This is because I use "stupidity" to mean the number of examples some intelligence needs in order to learn, while "performance" refers only to the quality of the output.
LLMs *partially* make up for being too stupid to live (literally: no living thing could survive if it needed so many examples) by going through each example faster than any living thing ever could — by as many orders of magnitude as there are between jogging and continental drift.
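(For scale, that analogy is roughly nine to ten orders of magnitude, using back-of-the-envelope figures of ~3 m/s for jogging and ~3 cm/year for continental drift:)

```python
import math

seconds_per_year = 365.25 * 24 * 3600
jogging = 3.0                          # metres per second
drift = 0.03 / seconds_per_year        # ~3 cm/year in metres per second

print(math.log10(jogging / drift))     # ~9.5 orders of magnitude
```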
LLMs have a reasonable learning rate at inference time (in-context learning is powerful), but a very poor learning rate in pretraining. And one issue with that is that we have an awful lot of cheap data to pretrain those LLMs with.
We don't know how much compute the human brain uses to do what it does. And what if we could pretrain with the same data-efficiency as humans, but at the cost of using 10,000x the compute?
It would be impossible to justify doing that for all but the most expensive, hard-to-come-by gold-plated datasets - ones that are actually worth squeezing every drop of performance gains out from.
The Turing Test was great until something that passed it (with an average human as interrogator) turned out to also not be able to count letters in a word — because only a special kind of human interrogator (the "scientist or QA" kind) could even think to ask that kind of question.
This paper, for example, uses the 'dual N-back test' as part of its evaluation. In humans this relates to variation in our ability to use working memory, which in humans relates to 'g'; but it seems pretty meaningless when applied to transformers -- because the task itself has nothing intrinsically to do with intelligence, and of course 'dual N-back' should be easy for transformers -- they should have complete recall over their large context window.
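For anyone unfamiliar with the task, here is a minimal sketch of the dual N-back matching rule (hypothetical function name, toy streams): for a system with reliable recall over its context, it reduces to an equality check against a sliding window, which is why it says so little when applied to transformers.

```python
def dual_n_back_hits(positions, sounds, n):
    """Flag indices where the position and/or sound matches the one n steps back.

    For humans this stresses working memory; for anything with reliable recall
    over its whole context, it is just a lookup n items back.
    """
    hits = []
    for i in range(n, len(positions)):
        pos_match = positions[i] == positions[i - n]
        snd_match = sounds[i] == sounds[i - n]
        if pos_match or snd_match:
            hits.append((i, pos_match, snd_match))
    return hits

# Example: 2-back over short streams
print(dual_n_back_hits([1, 4, 1, 4, 2], list("ABACB"), n=2))
```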
Human intelligence tests are designed to measure variation in human intelligence -- it's silly to take those same isolated benchmarks and pretend they mean the same thing when applied to machines. Obviously a machine doing well on an IQ test doesn't mean that it will be able to do what a high IQ person could do in the messy real world; it's a benchmark, and it's only a meaningful benchmark because in humans IQ measures are designed to correlate with long-term outcomes and abilities.
That is, in humans, performance on these isolated benchmarks is correlated with our ability to exist in the messy real-world, but for AI, that correlation doesn't exist -- because the tests weren't designed to measure 'intelligence' per se, but human intelligence in the context of human lives.
an entity which is better than any human at any task.
Fight me!
vardump•3h ago
At first, just playing chess was considered to be a sign of intelligence. Of course, that was wrong, but not obvious at all in 1950.
NitpickLawyer•2h ago
When deepmind was founded (2010) their definition was the following: AI is a system that learns to perform one thing; AGI is a system that learns to perform many things at the same time.
I would say that whatever we have today, "as a system", matches that definition. In other words, the "system" that is, say, gpt5/gemini3/etc. has learned to "do" (while "do" is debatable) a lot of tasks (read/write/play chess/code/etc.) "at the same time". And from a "pure" ML point of view, it learned those things from the "simple" core objective of next-token prediction (+ enhancements later, RL, etc.). That is pretty cool.
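For concreteness, a minimal numpy sketch of that core objective (illustrative only, not any lab's actual training code): every capability the system appears to have is trained through this single loss.

```python
import numpy as np

def next_token_loss(logits, targets):
    """Average cross-entropy of predicting each position's actual next token.

    logits:  (seq_len, vocab_size) unnormalized scores
    targets: (seq_len,) index of the token that actually came next
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# toy vocabulary of 5 tokens, sequence of 3 predictions
print(next_token_loss(np.random.randn(3, 5), np.array([2, 0, 4])))
```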
So I can see that as an argument for "yes".
But, even the person who had that definition has "moved the goalposts" of his own definition. From recent interviews, Hassabis has moved towards a definition that resembles the one from this paper linked here. So there's that. We are all moving the goalposts.
And it's not a recent thing. People did this back in the 80s. There's the famous "As soon as AI does something, it ceases to be AI" or paraphrased "AI is everything that hasn't been done yet".
NitpickLawyer•2h ago
----
In 2010, one of the first "presentations" given at Deepmind by Hassabis, had a few slides on AGI (from the movie/documentary "The Thinking Game"):
Quote from Shane Legg: "Our mission was to build an AGI - an artificial general intelligence, and so that means that we need a system which is general - it doesn't learn to do one specific thing. That's really key part of human intelligence, learn to do many many things".
Quote from Hassabis: "So, what is our mission? We summarise it as <Build the world's first general learning machine>. So we always stress the word general and learning here the key things."
And the key slide (that I think cements the difference between what AGI stood for then, vs. now):
AI - one task vs. AGI - many tasks, at human-level intelligence.
NitpickLawyer•2h ago
For reference, the average chess.com player is ~900 Elo, while the average FIDE-rated player is ~1600. So, yeah. Parrot or not, the LLMs can make moves above the average player. Whatever that means.
bossyTeacher•2h ago
What counts as a "thing"? Because arguably some of the pre-transformer deep ANNs would also qualify as AGI, yet no one would consider them intelligent (not in the human or animal sense of intelligence).
And you probably don't even need fancy neural networks. Get an RL algorithm and a properly mapped solution space, and it will learn to do whatever you want, as long as the problem can be mapped.
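A minimal sketch of that claim with tabular Q-learning on a toy "mapped" problem (a six-cell corridor; the environment and parameters are illustrative, not from any particular source): once states, actions, and rewards are defined, the generic update rule learns the policy.

```python
import random

# Toy "mapped" problem: a corridor of 6 cells; start at cell 0, reward at cell 5.
N_STATES, ACTIONS = 6, (-1, +1)          # actions: step left / step right
alpha, gamma, episodes = 0.1, 0.95, 2000

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(episodes):
    s = 0
    while s != N_STATES - 1:
        a = random.choice(ACTIONS)       # explore at random; Q-learning is off-policy
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # standard Q-learning update toward reward plus discounted best next value
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)   # learned greedy policy: +1 (go right) in every cell
```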
jltsiren•1h ago
When I was in college ~25 years ago, I took a class on the philosophy of AI. People had come up with a lot of weird ideas about AI, but there was one almost universal conclusion: that the Turing test is not a good test for intelligence.
The least weird objection was that the premise of the Turing test is unscientific. It treats "this system is intelligent" as a logical statement and seeks to prove or disprove it in an abstract model. But if you perform an experiment to determine whether a real-world system is intelligent, the right conclusion when the system passes is that it may be intelligent, while a different experiment might still show that it's not.
bigyabai•3h ago
> we ground our methodology in Cattell-Horn-Carroll theory, the most empirically validated model of human cognition.
kelseyfrog•2h ago
I can't begin to count the number of times I've encountered someone who holds an ontological belief for why AGI cannot exist and then for some reason formulates it as a behavioralist criteria. This muddying of argument results in what looks like a moving of the goalposts. I'd encourage folks to be more clear whether they believe AGI is ontologically possible or impossible in addition to any behavioralist claims.
zahlman•2h ago
The "Turing test" I always saw described in literature, and the examples of what passing output from a machine was imagined to look like, are nothing like what's claimed to pass nowadays. Honestly, a lot of the people claiming that contemporary chatbots pass come across like they would have thought ELIZA passed.
tsimionescu•2h ago
With today's chat bots, it's absolutely trivial to tell that you're not talking to a real human. They will never interrupt you, never continue their train of thought even though you're trying to change the subject, never go off on a complete non sequitur, never swear at you, etc. These are all things that the human "controls" should be doing to prove to the judges that they are indeed human.
LLMs are nowhere near beating the Turing test. They may fool some humans in some limited interactions, especially if the output is curated by a human. But left alone to interact with the raw output for more than a few lines, actively trying to tell whether you're interacting with a human or an AI (instead of wanting to believe), there really is no chance you'd be tricked.
bonoboTP•1h ago
So in that sense it's a triviality. You can ask ChatGPT whether it's human and it will say no upfront. And it has various guardrails in place against too much "roleplay", so you can't just instruct it to act human. You'd need a different post-training setup.
I'm not aware whether anyone did that with open models already.
og_kalu•47m ago
Post-training them to speak like a bot and deny being human has no effect on how useful they are. That's just an OpenAI/Google/Anthropic preference.
>If you take the raw model, it will actually be much worse at the kinds of tasks you want it to perform
Raw models are not worse. Literally every model-release paper that compares both shows them as better on benchmarks, if anything. Post-training degrading performance is a well-known phenomenon. What they are is more difficult to guide/control. Raw models are less useful because you have to present your input in certain ways, but they are not worse performers.
It's beside the point anyway because, again, you don't have to post-train them to act as anything other than a human.
>If their behavior needs to be restricted to actually become good at specific tasks, then they can't also be claimed to pass the Turing test if they can't within those same restrictions.
Okay, but that's not the case.
tsimionescu•37m ago
This is exactly what I was referring to.
zahlman•1h ago
But that is exactly the point of the Turing test.
bonoboTP•33m ago
If someone really wants to see a Turing-passing bot, I guess someone could try making one but I'm doubtful it would be of much use.
Anyways, people forget that the thought experiment by Turing was a rhetorical device, not something he envisioned building. The point was to say that semantic debates about "intelligence" are distractions.