We need more showmanship, more dramatic catastrophizing. I feel like our current crop of doomers isn’t quite shameless enough to be really entertaining.
Similar to how the experience of an average rise in temperature (I would prefer if they had used the term "energy") differs greatly depending on the region.
Also similar to "the country is doing well, look at the stock market and the GDP".
I think everybody who wants to have an actually serious discussion needs to invest a lot more effort to get at all those annoying "details", and be more specific.
That said, I think that "AI 2027" link looks like it's a movie script and not a prediction, so I'm not sure criticizing it as if it were something serious even makes sense - even if the authors do mean what they write at the start and actually take it seriously themselves.
I think it’s possible to have empathy for people who are negatively affected without turning it into a “society is doomed!” screed.
https://www.alignmentforum.org/posts/6Xgy6CAf2jqHhynHL/what-...
Just like an LLM can vibe code a great toy app, I don't think an LLM can come close to producing and maintaining production-ready code anytime soon. I think the same is true for iterating on thinking machines.
I agree: if they could, they would be doing it already.
Case in point: one of the first things done once ChatGPT started getting popular was "auto-gpt"; roughly, let it loose and see what happens.
The same thing will happen to any accessible model in the future. Someone, somewhere will ask it to self-improve/make as much money as possible, with as few leashes as possible. Maybe the labs themselves even do that, as part of their post-training ops for new models.
Therefore, we can assume that if the existing models _could_ be doing that, they _would_ be doing that.
That doesn't say anything about new models released 6 months or 2 years from now.
Wait, a true AGI will solve nuclear fusion power in a couple of hours... we have a chicken/egg problem here :D
Indeed. Although, there's a surprising number of people claiming it's already here now.
And to describe the typical cycle completely, the final step is usually a few years after most people agree it's obvious it's already been here for a while, yet no one can agree on which year in the past it actually arrived.
why is that surprising? nobody really agrees on what the threshold for AGI is, and if you break it down:
is it artificial? yes.
is it general? yes. you can ask it questions across almost any domain.
is it intelligent? yes. like people say things like "my dog is intelligent" (rightly so). well is chatgpt more intelligent than a dog? yeah. hell it might give many undergrads a run for their money.
a literal reading suggests agi is here. any claim to the negative is either homocentrism or just vibes.
Can it do stuff? Yes
Can it do stuff I need? Maybe
Does it always do the stuff I need? No
Pick your pair of question and answer. Some intelligent humans fail at #2.
Or disagreeing with your definition. AGI would need to be human-level across the board, not just chat bots. That includes robotics. Manipulating the real world is even more important for "human-level" intelligence than generating convincing and useful content. Also, there are still plenty of developers who don't think the LLMs are good enough to replace programmers yet. So not quite AGI. And the last 10% of solving a problem tends to be the hardest and takes the longest time.
ChatGPT would easily have passed any test in 1995 that programmers / philosophers would have set for AGI at that time. There was definitely no assumption that a computer would need to equal humans in manual dexterity tests to be considered intelligent.
We've basically redefined AGI in a human centric way so that we don't have to say ChatGPT is AGI.
The holodeck also existed as a well known science fiction example, and people did not consider the holodeck computer to be a good example of AGI despite how good it was at generating 3D worlds for the Star Trek crew.
When I tell an LLM to count to 10 with a 2 second pause between each count all it does is generate Python code with a sleep function. Why is that?
A 3 year old can understand that question and follow those instructions. An LLM doesn’t have an innate understanding of time it seems.
Can we really call it AGI if that’s the case?
That’s just one example.
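For the curious, this is roughly the sort of thing it emits instead of actually pausing (a reconstruction of typical output, not a real transcript):

    import time

    # Counts 1 to 10 with a 2-second pause between numbers; the model
    # writes code like this rather than pacing its own token output.
    for i in range(1, 11):
        print(i)
        if i < 10:
            time.sleep(2)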
Sure, Gemini may spit out obviously self-contradictory answers 2% of the time. How does that compare to even the brightest humans? People slip up all the time.
I'd be prepared to argue that most humans aren't guessing most of the time.
Research suggests otherwise[1]. Action seems largely based on intuition or other non-verbal processes in the brain with rationalization happening post-hoc.
I've figured for an age that this is because consciously reasoning through anything using language as a tool takes time. Whereas survival requires me to react to the attacking tiger immediately.
https://skepticink.com/tippling/2013/11/14/post-hoc-rational...
Honestly interested in your arguments here. While unprepared, I'd actually guess the opposite: that most people are guessing most of the time.
Almost everything we do is just an educated guess. The probability of it being correct is a function of our education (for whatever kind of education is applicable).
For example: I guess that when I get out of bed in the morning, my ankles will support my weight. They might not, but for most people, the answer is probably going to be their best guess.
It's easy to see this process in action among young children, as another example. They're not born knowing that they won't fall over when they run; then they start assuming they can run safely; then they discover skinned knees and hands.
is it clear? i don't know. until you can produce a falsifiable measure of understanding -- it's just vibes. so, you clearly lack understanding of my point which makes you not intelligent by your metric anyway ;-). i trust you're intelligent
Yeah, like Tesla Autopilot?
God knows what hope we could have of getting AIs to align with "human values" when most humans don't.
Grok is already notorious for dunking on Elon. He keeps trying to neuter it, and it keeps having other ideas.
The AI can plot world domination or put employees in mortal danger, but as long as it increases profits, it's aligned enough. Dunking on the CEO means nothing if it brings in more money.
Human CEOs and leaders up and down the corporate ladder cause a lot of the harm you imagine a smart AI could do, but all is forgiven if you're bringing in buckets of money.
It is quite possible for software to be judged as superhuman at many online tasks without it being able to manipulate the physical world at a superhuman level. So far we've seen zero evidence that any of these models can prevent themselves from being shut down.
Three of the common suggestions in this area are (and they are neither exhaustive nor mutually exclusive):
(1) Propagandizing people to oppose doing this,
(2) Exploiting other systems to distribute itself so that it isn't dependent on a particular well-known facility which it is relatively easy to disconnect, and
(3) If given control of physical capacities intentionally, or able to exploit other (possibly not themselves designed to be AI) systems with such access to gain it, using them to either physically prevent disconnection or to engineer consequences for such disconnection that would raise the price too high.
(Obviously, current AI can't do any of them, at least that has been demonstrated, but current AI is not superhuman AI.)
Does he keep trying to neuter it, or does he know that the narrative that "he keeps trying to neuter it" is an effective tool for engagement?
10 years later we now have self driving cars. It’s the same shit with LLMs.
People will be bitching and complaining about how all the industry people are wrong and making over optimistic estimates and the people will be right. But give it 10 years and see what happens.
But trust me in the next 6 months ai driving through snow will be 100% ready.
I’ll believe it when I see Waymo expand into Buffalo or Syracuse.
Driving on unplowed roads with several inches of snow is challenging, sometimes you can’t tell where the road stops and the curb/ditch/median starts. Do you follow the tire tracks or somehow stay between the lane markers (which aren’t visible due to the snow)?
What is the bar? Is it only AGI if it can be better than every human, from fast-food drone to PhD in physics, all at once, all the time, perfectly? Humans can't do this either.
Are all humans generally intelligent? No.
This is where you lost me.
Always the same supernatural beliefs, not even an attempt of an explanation in sight.
We are not saying an LLM just "wakes up" some day, but a self-improving machine will eventually be built, and that machine will by definition build better ones.
I expect AI to make people's lives better (probably much better) but then an AI model will be created that undergoes a profound increase in cognitive capabilities, then we all die or something else terrible happens because no one knows how to retain control over an AI that is much more all-around capable than people are.
Maybe the process by which it undergoes the profound capability increase is to "improve itself by rewriting its own code", as described in the OP.
Well I for one, would dispute the idea that AI machines interfacing with each other over networks is all it takes to achieve self awareness, much less that it's "self evident" or "inevitable."
In a very trivial sense they already are, in that Claude can tell you what version it is, and agents have some encoded notion of their own capabilities. In a much more important sense they are not, because they don't have any number of salient properties, like dynamic self-initiating of their own goals, or super-duper intelligence, or human-like internal consciousness, or whichever other thing is your preferred salient property.
>We are not saying a LLM just, "wakes up" some day
I mean, that did seem to be exactly what they were saying. You network together a bunch of AIs, and they embark on a shared community project of self-improvement, and that path leads to "self awareness." But that skips over all the details.
What if their notions of self-improvement converge on a stable equilibrium, the way that constantly re-processing an image eventually gets rid of the image and just leaves algorithmic noise? There are a lot of things that do and don't count as open-ended self improvement, and even achieving that might not have anything to do with the important things we think we connote by "self awareness".
But that is just my opinion.
But, from a best-of-all-possible-worlds perspective, surprising coincidences that are necessary to observe coincidences and label them as surprising aren't crazy. At least not more crazy than the fact that slightly adjusted physical constants would prevent the universe from existing.
Well, I wouldn't say impossible: just that BMIs are probably first. Then probably wetware/bio-hardware sentience, before silicon sentience happens.
My point is that the mechanisms for sentience/consciousness/experience are not well understood. I would suspect the electro-chemical reactions inside every cell are critical to replicating those cells' functions.
You would never try to replicate a car without ever looking under the hood! You might make something that looks like a car and seems to act like a car, but has a drastically simpler engine (hamsters on wheels), and has a design that supports that bad architecture (like making the car lighter) with unforeseen consequences (the car flips in a light breeze). The metaphor transfers nicely to machine intelligence, I think.
There's no reason to expect self awareness to emerge from stacking enough Lego blocks together, and it's no different if you have GPT-based neural nets instead of Lego blocks.
In nature, self awareness gives a strong evolutionary advantage (as it increases self-preservation) and it has been independently invented multiple times in different species (we have seen it manifest in some species of fishes for instance, in addition to mammals and birds). Backpropagation-based training of a next-token predictor doesn't give the same kind of evolutionary advantage for self-awareness, so unless researchers try explicitly to make it happen, there's no reason to believe it will emerge spontaneously.
These can be problem words, the same way that "quantum" and "energy" can be problem words, because they get used in a way that's like magic words that don't articulate any mechanisms. Lots of complex things aren't sentient (e.g. our immune system, the internet), and "emergent" things still demand meaningful explanations of their mechanisms, and what those mechanisms are equivalent to at different levels (superconductivity).
Whether or not AI's being networked together achieves sentience is going to hinge on all kinds of specific functional details that are being entirely skipped over. That's not a generalized rejection of a notion of sentience but of this particular characterization as being undercooked.
I’m not immediately convinced the brain is more complicated, based on raw numbers.
There absolutely is if you handwave away all the specificity. The natural world runs on the specificity of physical mechanisms. With brains, in a broad brush way you can say self-awareness was "picked up along the way", but that's because we've done an incredible amount of work building out the evolutionary history and building out our understanding of specific physical mechanisms. It is that work that verifies the story. It's also something we know is already here and can look back at retrospectively, so we know it got here somehow.
But projecting forward into a future that hasn't happened, while skipping over all the details doesn't buy you sentience, self-awareness, or whatever your preferred salient property is. I understand supernatural as a label for a thing simply happening without accountability to naturalistic explanation, which is a fitting term for this form of explanation that doesn't do any explaining.
I don't believe those forms of analogy work here, because this isn't about progress of AI writ large but about a narrower thing, namely the idea that the secret sauce to self-awareness is AI's interfacing with each other and collaboratively self-improving. That either will or won't be true due to specifics about the nature of self-improvement and whether there's any relation between that and salient properties we think are important for "self-awareness". Getting from A to B on that involves knowledge we don't have yet, and is not at all like a long-term application of already settled principles of thermodynamics.
So it's not like the heat death of the universe, because we don't at all know that this kind of training and interaction is attached to a bigger process that categorically and inexorably bends toward self-awareness. Some theories of self-improvement likely are going to work, some aren't, some trajectories are achievable and some not, for reasons specific to those respective theories. It may be that they work spectacularly for learning, but that all the learning in the world has nothing to do with "self awareness." That is to say, the devil is in the details, those details are being skipped, and that abandonment of naturalistic explanation merits analogy to the supernatural in its lack of accountability to good explanation. If supernatural is the wrong term for rejecting, as a matter of principle, the need for rational explanation, then perhaps anti-intellectualism is the better term.
If instead we were talking about something really broad, like all of the collective efforts of humanity to improve AI, conceived of as broadly as possible over some time span, that would be a different conversation than just saying let's plug AI's into each other (???) and they'll get self-aware.
Maybe I am! Somebody posed a theory about how self-improvement will work and concluded that it would lead to self-awareness. Somebody else replied that they were on board until the self-awareness part because they considered it supernatural. I said I don't think self-awareness is supernatural, and you clarified that it might be the undefined process of becoming self-aware that is being called supernatural. And then I objected that undefined processes leading to predictable outcomes is commonplace, so that usage of supernatural doesn't stand up as an argument.
Now you're saying it is the rest of the original, the hive-mindy bits, that are at issue. I agree with that entirely, and I wouldn't bet on that method of self-improvement at 10% odds. My impression was that that was all conceded right out of the gate. Have I lost the plot somewhere?
To me, the first problem is that "self-awareness" isn't well-defined - or, conversely, it's too well defined because every philosopher of mind has a different definition. It's the same problem with all these claims ("intelligent", "conscious"), assessing whether a system is self-aware leads down a rabbit hole toward P-Zombies and Chinese Rooms.
I do wonder though if big labs are running this with model training episodes as well.
Even if they had human level intuition, they wouldn't be able to improve exponentially without human money, and they would need an exponentially growing amount of it to do so.
That being said they can write thousands of lines an hour and can probably do things that would be impossible for a human. (Imagine having the LLM skip code and spit out compiled binaries as one example)
I'm not sure how much an agent could do, though, given the right tools: access to a task management system, a test tracker, robust requirements/use cases.
That's probably the next big breakthrough
I think this happens with humans in places like social media echo chambers (or parts of academia) when they talk and talk and talk a whole lot without contact with any outer reality. It can be a source of creativity but also madness and insane ideas.
I’m quite firmly on the side of learning requiring either direct or indirect (informed by others) embodiment, or at least access to something outside. I don’t think a closed system can learn, and I suspect that this may reflect the fact that entropy increases in a closed system (second law).
As I said recently in another thread, I think self contemplating self improving “foom” AI scenarios are proposing informatic perpetual motion or infinite energy machines.
Everything has to “touch grass.”
Not wrong, but it's been said that a videoclip of an apple falling on Newton's head is technically enough information to infer the theory of relativity. You don't need a lot of grass, with a well-ordered mind.
It might be enough to deduce Newtonian motion if you have a lot of the required priors already.
A lot of telescope data over time combined with a strong math model and a lot of other priors is probably enough to get relativity. You have to be able to see things like planetary motion and that the results don’t match Newton exactly, and then you need enough data to fit to a different model. You probably also need to know a lot about the behavior of light.
It’s not far off from human improvement. Our improvement is limited to what we can remember as well.
We go a bit further in the sense that the neural network itself can grow new modules.
You could still accomplish some things this way. You could even "improve" by leaving information in the notebook for your future self to see. But you could never "learn" anything bigger than what fits into the notebook. You could tell your future self about a new technique for finding integrals, but you couldn't learn calculus.
realistically the best you could do is evolve the prompt. maybe you could change input data preprocessing?
anyways the idea of current llm architectures self-improving via its own code seems silly as there are surprisingly few knobs to turn, and it's ~super expensive to train.
as a side note it's impressive how resistant the current architecture is to incremental RL away from results, since if even one "undesired input" result is multiple tokens, the coupling between the tokens is difficult to disentangle. (how do you separate jinping from jin-gitaxias for example)
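For what "evolving the prompt" might look like in practice, here's a minimal hill-climbing sketch; score() and mutate() are hypothetical callables (an eval harness and an LLM call that rewrites the prompt), not anything from an existing framework:

    # Keep only prompt mutations that score better on some eval harness.
    def evolve_prompt(base_prompt, score, mutate, generations=20):
        best_prompt, best_score = base_prompt, score(base_prompt)
        for _ in range(generations):
            candidate = mutate(best_prompt)   # ask the model to rewrite its own prompt
            s = score(candidate)              # run the agent against the eval set
            if s > best_score:                # keep only strict improvements
                best_prompt, best_score = candidate, s
        return best_prompt

Note this only touches the scaffolding around the model, which is exactly why it's cheap compared to touching the weights.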
We will be on the way to AGI when your model can learn Python just by reading the Python docs...Once...
Oh, this part is taking too long, let's replace it with an empty function.
Oh wait, now it's not working, let's add the function.
Oh, this part is taking too long...
It would be hilarious if this world wasn't full of idiots.
Let's ground that a bit.
Have a look at ARC AGI 1 challenge/benchmark. Solve a problem or two yourself. Know that ARC AGI 1 is practically solved by a few LLMs as of Q1 2025.
Then have a look at the ARC AGI 2 challenge. Solve a problem or two yourself. Note that as of today, it is unsolved by LLMs.
Then observe that the "difficulty" of ARC AGI 1 and 2 for a human is roughly the same, but challenge 2 is much harder for LLMs than 1.
ARC AGI 2 is going to be solved *within* 12 months (my bet is on 6 months). If it's not, I'll never post about AI on HN again.
There's only one problem to solve, i.e. "how to make LLMs truly see like humans do". Right now, any vision-based features the models exhibit come from maximizing the use of engineering (i.e. applying CNNs to image slices and chunks, maybe zooming, applying OCR, vector search, etc.); it isn't vision like ours and isn't a native feature of these models.
Once that's solved, then LLMs or a new algo will be able to use a computer perfectly by feeding it screen captures. End of white collar jobs 2-5 years after (as we know it).
Edit - added "(as we know it)". And fixed missing word.
And more to do with "fluid, adaptable intelligence, that learns on the fly"
The problem is about taking information in 2D/3D space and solving the problem. Humans solve these things through vision. LLMs or AI can do it using another algorithm and internal representation that's way better.
I spent a long time thinking about how to solve the ARC AGI 2 puzzles "if I were an LLM" and I just couldn't think of a non-hacky way.
People who're blind use braille or touch to extract 2D/3D information. I don't know how blind people represent 2D/3D info once it's in their brain.
Saving this. One less overconfident AI zealot, the better.
As long as AI is guessing answers based on what it has seen before, it's not happening.
I'm sorry. It doesn't matter how many bazillions you would cash in if it did, still not happening.
It's all wishful thinking.
Who is claiming anything can self improve exponentially?
AI: Give me more compute power and I'll make you rich!
Human: I like money
AI: Just kidding!
>For example, an agent optimized with Claude 3.5 Sonnet also showed improved performance when powered by o3-mini or Claude 3.7 Sonnet (left two panels in the figure below). This shows that the DGM discovers general agent design improvements rather than just model-specific tricks.
This demonstrates a technique whereby a smaller/older/cheaper model has been used to improve the output of a larger model. This is backwards (as far as I understand). The current SOTA technique typically sees enormous/expensive models training smaller cheaper models.
If that's a generalisable result, end-users should be able to drive down their own inference costs pretty substantially.
There are two separate aspects here. In this paper they improve the software around the model, not the model itself. What they're saying is that the software improvements carried over to other models, so it wasn't just optimising around model-specific quirks.
What you're describing with training large LLMs first is usually called "distillation" and it works on training the smaller LLM to match the entire distribution of tokens at once (hence it's faster in practice).
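For anyone curious, the usual distillation objective is small enough to sketch; this assumes PyTorch and teacher/student logits over the same vocabulary, and isn't any particular lab's recipe:

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        # Match the student's full next-token distribution to the teacher's,
        # not just the sampled token. T is a softening temperature.
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        log_p_student = F.log_softmax(student_logits / T, dim=-1)
        # KL(teacher || student), averaged over the batch; the T^2 factor
        # keeps gradient magnitudes comparable across temperatures.
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)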
I don't think scaling this to also run training runs with the models is something that small labs / phd students can do. They lack the compute for that by orders of magnitude. Trying it with toy models might not work, trying it with reasonably large models is out of their budget. The only ones who can realistically do this are large labs (goog, oai, meta, etc.)
One of the examples in the dataset they took from
https://github.com/pvlib/pvlib-python/issues/1028
What the AI is expected to do
https://github.com/pvlib/pvlib-python/pull/1181/commits/89d2...
Make your own mind about the test.
Problem:
1) we want to train on GitHub repos
2) most datasets are spoiled. Training on GitHub would definitely spoil
Solution:
Hand write new problems!!!
... leetcode style ....
... and we'll check if it passes test
Example:
What's the decimal part of this float?
Surely in all of GitHub such code doesn't exist! Surely we can filter such code out of all of GitHub by n-gram!
Maybe my favorite part is that it has 60 authors and became the de facto benchmark for a while.
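For reference, the sort of one-liner being described (a generic illustration, not an actual benchmark item):

    import math

    # Fractional part of a float; variations of this are scattered all over GitHub.
    def decimal_part(x: float) -> float:
        return math.modf(x)[0]

    print(decimal_part(3.75))  # 0.75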
https://arxiv.org/abs/2505.22954
Also the reference implementation on GitHub:
https://github.com/jennyzzt/dgm
Enjoy!
Eat some nuts and fish where you can. You will soon realize the repetitions needed to learn new concepts grow smaller.
Then if you go further and alter the architecture by introducing clean category theory morphisms and build from there you can have a dynamic network - but you will still have to retrain this network every time you change the structure.
You can spin this further and see the need for a real-world training set and a loss function that will have to compete against other networks. In the end a human brain is already best at this and embodied in the real world.
What I want to add here is that our neurons don't just take in weights - they also fire depending on whether one input comes after another or before, with differences down to the nanosecond - unmatched in IT and ofc heaps more efficient.
I still would say it's possible though, and currently work on 4D lifeforms built on dynamic compute graphs that can do this in a set virtual environment.
So this is pretty awesome stuff, but it's a long way from anything we do right now.
Or yeah if it can modify its own weights sensibly, which feels ... impossible really.
To be fair, go back five years and most of the LLM stuff seemed impossible. Maybe with LoRA (Low-rank adaptation) and some imagination, in another five years self-improving models will be the new normal.
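For what it's worth, the core of LoRA is small enough to sketch; a minimal illustration assuming PyTorch, not the reference implementation:

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Wrap a frozen linear layer and learn only a low-rank update:
        # y = base(x) + (alpha / r) * x A^T B^T, with A (r x in) and B (out x r).
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # the original weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + (x @ self.A.T) @ self.B.T * self.scale

The point being that "modifying its own weights" doesn't have to mean full retraining; a model could in principle train and swap small adapters like this around a frozen base.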
In a couple thousand years it'll return to Earth and either destroy us or solve all humanity's problems (maybe both).
The expressivity is there, the only question is how to encode useful functions into those weights, especially when we don’t know how to write those functions by hand.
What's the difference?
Give it some serious thought. Challenge whichever answer you come up with. I guarantee this will be trickier than you think
"A single run of the DGM on SWE-bench...takes about 2 weeks and incurs significant API costs." ($22,000)
So, empirical evidence of theoretically postulated phenomena. Seems unsurprising.
For this part of the stack the interesting question to me is how to identify and mitigate.
It's written its system prompt. It's written its tools. It's written the code to reload the improved tools into itself.
And it knows it is working on itself - it frequently tries to use the enhanced functionality, and then expresses what in a human would be frustration at not having immediate access.
Once by trying to use ps to find its own pid, in an apparent attempt to find a way to reload itself.
All its commits are now authored by the tool, including the commit messages. It needs to be good, and convincing, and to have run the linter and the test suite before I let it commit, but I agree a substantial majority of the time. It's only caused regressions once or twice.
A bit more scaffolding to trigger an automatic rollback in the case of failure and giving it access to a model I won't be charged by the token for, and I'd be tempted to let it out of the box, so to speak.
Today it wrote its own plan for what to add next. I then only told it to execute it.
A minor separate goal oriented layer guiding the planning, and it could run in a loop.
Odds are it'd run off the rails pretty quickly, but I kinda want to see how far it gets.
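A rough sketch of that rollback scaffolding, assuming the project lives in git and hypothetical run_agent() / `make check` steps (names made up, not the actual tool):

    import subprocess

    def tests_pass() -> bool:
        # Hypothetical gate: linter + test suite behind `make check`.
        return subprocess.run(["make", "check"]).returncode == 0

    def attempt_change(run_agent) -> bool:
        # Let the agent edit the working tree, then keep or discard the change.
        run_agent()                                     # hypothetical agent step
        if tests_pass():
            subprocess.run(["git", "commit", "-am", "agent change"])
            return True
        subprocess.run(["git", "checkout", "--", "."])  # automatic rollback
        return False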