1) LLMs have failed to live up to the hype.
Maybe. Depends on whose hype. But I think it is fine to say that we don't have AGI today (however that is defined) and that some people hyped that up.
2) LLMs haven't failed outright
I think that this is a vast understatement.
LLMs have been a wild success. At big tech companies, over 40% of checked-in code is LLM-generated. At smaller companies the proportion is larger. ChatGPT has over 800 million weekly active users.
Students throughout the world, and especially in the developed world, are using "AI" at rates of 85-90% (according to some surveys).
Between 40% and 90% of professionals (depending on the survey and profession) are using "AI".
This is 3 years after the launch of ChatGPT (and the capabilities of ChatGPT 3.5 were so limited compared to today that it is a shame they get bundled together in our discussions). I would say instead of "failed outright" that they are the most successful consumer product of all time (so far).
I have a really hard time believing that stat without any context, is there a source for this?
But they do that while making the codebase substantially worse for the next person or LLM: large code size, inconsistent behavior, duplicates of duplicates of duplicates strewn everywhere with little to no pattern, so you might have to fix something a dozen times in a dozen ways for a dozen reasons before it actually works, and nothing handles it efficiently.
The only thing that matters in a business is value produced, and I'm far from convinced that they're even break-even in most cases, even if they were free. They're burning the future with tech debt, in the hope that it will be able to handle what humans cannot, which does not seem true at all to me.
Hopefully one of the major companies will release a comprehensive report to the public, but they seem to guard these metrics.
Assuming this is true though, how much of that 40% is boilerplate or simple, low effort code that could have been knocked out in a few minutes previously? It's always been the case that 10% of the code is particularly thorny and takes 80% of the time, or whatever.
Not to discount your overall point, LLMs are definitely a technical success.
Really? I derive a ton of value from it. For me it’s a phenomenal advancement and not a failure at all.
I've been programming for 30+ years and am now a people manager. Claude Code has enabled me to code again and I'm several times more productive than I ever was as an IC in the 2000s and 2010s. I suspect this person hasn't really tried the most recent generation; it is quite impressive and works very well if you do know what you are doing.
"it still requires genuine expertise to spot the hallucinations"
"works very well if you do know what you are doing"
For example, build a TUI or GUI with Claude Code while only giving it feedback on the UX/QA side. I've done it many times despite 20 years of software experience. -- Some stuff just doesn't justify me spending my time credentializing in the impl.
Hallucinations that lead to code that doesn't work just get fixed. Most code I write isn't like "now write an accurate technical essay about hamsters" where hallucinations can sneak through lest I scrutinize it; rather the code would just fail to work and trigger the LLM's feedback loop to fix it when it tries to run/lint/compile/typecheck it.
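Roughly the loop I mean, as a sketch (the `llm` and `apply_patch` arguments are hypothetical stand-ins rather than any particular tool's API, and the check commands are just examples of a Python project's gates):

```python
import subprocess

def run_checks() -> tuple[bool, str]:
    """Run the project's own gates; any failure output becomes feedback for the model."""
    for cmd in (["ruff", "check", "."], ["mypy", "."], ["pytest", "-q"]):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, result.stdout + result.stderr
    return True, ""

def fix_until_green(llm, apply_patch, task: str, max_rounds: int = 5) -> bool:
    """llm and apply_patch are hypothetical stand-ins for whatever agent/tooling you use."""
    prompt = task
    for _ in range(max_rounds):
        apply_patch(llm.generate(prompt))   # model proposes a change, tooling applies it
        ok, errors = run_checks()
        if ok:                              # it compiles, lints, type-checks, and tests pass
            return True
        prompt = f"{task}\n\nThe previous attempt failed these checks:\n{errors}"
    return False                            # hallucinated APIs usually die in this loop
```

The point is only that a made-up function name fails the gates and gets fed back, rather than surviving the way a made-up fact in prose can.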
But the idea that you can only build with LLMs if you have a software engineer copilot isn't true and inches further away from true every month, so it kinda sounds like a convenient lie we tell ourselves as engineers (and understandably so: it's scary).
If you know what you are doing, it works kind of mid. You see how anything more than a prototype will create lots of issues in the long run.
Dunning-Kruger effect in action.
You have decades upon decades of experience on how to approach software development and solve problems. You know the right questions to ask.
The actual non-programmers I see on Reddit are having discussions about topics such as “I don’t believe that technical debt is a real thing” and “how can I go back in time if Claude Code destroyed my code”.
That's the issue. AI coding agents are only as good as the dev behind the prompt. It works for you because you have an actual background in software engineering of which coding is just one part of the process. AI coding agents can't save the inexperienced from themselves. It just helps amateurs shoot themselves in the foot faster while convincing them they're a marksman.
The author seems to have a bias. The truth is that we _do not know_ what is going to happen. It's still too early to judge the economic impact of current technology - companies need time to understand how to use this technology. And, research is still making progress. Scaling of the current paradigms (e.g. reasoning RL) could make the technology more useful/reliable. The enormous amount of investment could yield further breakthroughs. Or.. not! Given the uncertainty, one should be both appropriately invested and diversified.
This isn't just a critique of anecdotes: I've noticed that LLMs are specifically good at convincing people of an "overly optimistic" (sometimes bordering on delusional) understanding of the quality of work they are producing.
The current AI hype is fueled by public markets, and as they found out during the pandemic, the first one to blink and acknowledge the elephant in the room loses, bigly.
So, even in the face of a devastating demonstration of "AI" ineffectiveness (which I personally haven't seen, despite things being, well, entirely underwhelming), we may very well be stuck in this cycle for a while yet...
Lol, someone doesn't understand how the power structure works ("the golden rule"). There is a saying: if you owe the bank 100k, you have a problem; if you owe the bank ten million, the bank has a problem. OpenAI and the other players have made this bubble so big that there is no way the power system will allow itself to take the hit. Expect some sort of tax-subsidized bailout in the near future.
But there is so much real economic value being created - not speculation, but actual business processes - billions of dollars - it’s hard to seriously defend the claim that LLMs are “failures” in any practical sense.
Doesn’t mean we aren’t headed for a winter of sobering reality… but it doesn’t invalidate the disruption either.
Is there really a clear-cut distinction between the two in today's VC and acquisition based economy?
"We just cured cancer! All cancer! With a simple pill!"
"But you promised it would rejuvenate everyone to the metabolism of a 20 year old and make us biologically immortal!"
New headline: "After spending billions, project to achieve immortality has little to show..."
The argument that computational complexity has something to do with this could have merit, but the article certainly gives no indication as to why. Is the brain NP-complete? Maybe, maybe not. I could see many arguments for why modern research will fail to create AGI, but just hand-waving "reality is NP-hard" is not enough.
The fact is: something fundamental has changed that enables a computer to pretty effectively understand natural language. That’s a discovery on the scale of the internet or google search and shouldn’t be discounted… and usage proves it. In 2 years there is a platform with billions of users. On top of that huge fields of new research are making leaps and bounds with novel methods utilizing AI for chemistry, computational geometry, biology etc.
It’s a paradigm shift.
You understand how the tech works right? It's statistics and tokens. The computer understands nothing. Creating "understanding" would be a breakthrough.
Edit: I wasn't trying to be a jerk. I sincerely wasn't. I don't "understand" how LLMs "understand" anything. I'd be super pumped to learn that bit. I don't have an agenda.
I would say that, except for the observable and testable performance, what else can you say about understanding?
It is a fact that LLMs are getting better at many tasks. From their performance, they seem to have an understanding of say python.
The mechanistic way this understanding arises is different than humans.
How can you say then it is 'not real', without invoking the hard problem of consciousness, at which point, we've hit a completely open question.
I’m always struck by how confidently people assert stuff like this, as if the fact that we can easily comprehend the low-level structure somehow invalidates the reality of the higher-level structures. As if we know concretely that the human mind is something other than emergent complexity arising from simpler mechanics.
I’m not necessarily saying these machines are “thinking”. I wish I could say for sure that they’re not, but that would be dishonest: I feel like they aren’t thinking, but I have no evidence to back that up, and I haven’t seen non-self-referential evidence from anyone else.
You don’t know how your own mind “understands” something. No one on the planet can even describe how human understanding works.
Yes, LLMs are vast statistical engines but that doesn’t mean something interesting isn’t going on.
At this point I’d argue that humans “hallucinate” and/or provide wrong answers far more often than SOTA LLMs.
I expect to see responses like yours on Reddit, not HN.
I suppose that says something about both of us.
LLMs activate similar neurons for similar concepts not only across languages, but also across input types. I’d like to know if you’d consider that as a good representation of “understanding” and if not, how would you define it?
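A rough way to see the cross-language part of that claim from the outside is to compare embeddings (a sketch; the model name is just a commonly used multilingual checkpoint, and embedding similarity is only a crude proxy for what interpretability work measures inside the network):

```python
from sentence_transformers import SentenceTransformer, util

# Multilingual embedding model; an external proxy, not mechanistic interpretability.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

sentences = [
    "The cat sleeps on the sofa.",     # English
    "Le chat dort sur le canapé.",     # French, same concept
    "The stock market fell sharply.",  # English, unrelated concept
]
emb = model.encode(sentences, convert_to_tensor=True)

print(util.cos_sim(emb[0], emb[1]).item())  # high: same concept across languages
print(util.cos_sim(emb[0], emb[2]).item())  # lower: different concept, same language
```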
Understand just means “parse language” and is highly subjective. If I talk to someone African in Chinese they do not understand me but they are still conscious.
If I talk to an LLM in Chinese it will understand me but that doesn’t mean it is conscious.
If I talk about physics to a kindergartner they will not understand but that doesn’t mean they don’t understand anything.
Do you see where I am going?
OP says it is because predicting the next token can be correct or not, but the output always looks plausible, because plausibility is what the model calculates. Therefore it is dangerous and cannot be fixed, because that is how it works in essence.
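A toy sketch of that mechanism (all numbers made up): the model only ever has a probability distribution over fluent continuations, with no separate flag for "this one is true."

```python
import random

# Toy next-token distribution after the prefix "The console command is ";
# every candidate is plausible-looking, none carries a truth value.
next_token_probs = {
    "/spawn_item": 0.34,
    "/give_item": 0.31,
    "/additem": 0.25,
    "/fly": 0.10,
}

tokens, weights = zip(*next_token_probs.items())
choice = random.choices(tokens, weights=weights, k=1)[0]
print("The console command is", choice)  # reads fine whether or not it exists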
Literally yesterday ChatGPT hallucinated an entire feature of a mod for a video game I am playing including making up a fake console command.
It just straight up doesn’t exist, it just seemed like a relatively plausible thing to exist.
This is still happening. It never stopped happening. I don’t even see a real slowdown in how often it happens.
It sometimes feels like the only thing saving LLMs are when they’re forced to tap into a better system like running a search engine query.
The response to your query might not be what you needed, similar to interacting with an RDBMS and mistyping a table name and getting data from another table or misremembering which tables exist and getting an error. We would not call such faults "hallucinations", and shouldn't when the database is a pile of eldritch vectors either. If we persist in doing so we'll teach other people to develop dangerous and absurd expectations.
A steel man argument for why winter might be coming is all the dumb stuff companies are pushing AI for. On one hand (and I believe this) we argue it’s the most consequential technology in generations. On the other, everybody is using it for nonsense like helping you write an email that makes you sound like an empty suit, or providing a summary you didn’t ask for.
There’s still a ton of product work to cross whatever that valley’s called between concept and product, and if that doesn’t happen, money is going to start disappearing. The valuation isn’t justified by the dumb stuff we do with it; it needs PMF.
What we didn't get was what had been expected, namely things like expert systems that were actual experts, so-called 'general intelligence', and war waged through 'blackboard systems'.
We've had voice controlled electronics for a long time. On the other hand, machine vision applications have improved massively in certain niches, and also allowed for new forms of intense tyranny and surveillance where errors are actually considered a feature rather than a bug since they erode civil liberties and human rights but are still broadly accepted because 'computer says'.
While you could likely argue "leaps and bounds with novel methods utilizing AI for chemistry, computational geometry, biology etc." by downplaying the first part or clarifying that it is mainly an expectation, I think most people are going to, for the foreseeable future, keep seeing "AI" as more or less synonymous with synthetic infantile chatbot personalities that substitute for human contact.
Where the current wave all falls apart is on the financials. None of that makes any sense and there’s no obvious path forward.
Folks say handwavy things like “oh they’ll just sell ads”, but even a cursory analysis shows that the math doesn’t add up relative to the sums of money being invested at the moment.
Tech wise I’m bullish. Business wise, AI is setting up to be a big disaster. Those that aimlessly chased the hype are heading for a world of financial pain.
OK, so I think there are two things here that people get mixed up on.
First, inference of the current state of the art is cheap now. There are no two ways about it. Statements from Google and Altman, as well as the prices at which third parties sell tokens of top-tier open-source models, paint a pretty good picture. Ads would be enough to make OpenAI a profitable company selling current SOTA LLMs to consumers.
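A back-of-the-envelope version of that claim (every figure below is an illustrative assumption, not a reported number, apart from the weekly user count cited upthread):

```python
# Back-of-the-envelope sketch; all figures are illustrative assumptions,
# except the ~800M weekly active users mentioned upthread.
weekly_active_users = 800_000_000
tokens_per_user_per_month = 300_000      # assumed heavy free-tier usage
cost_per_million_tokens = 0.50           # assumed blended serving cost, $/1M tokens
ad_revenue_per_user_per_month = 1.50     # assumed; mature ad platforms vary widely

monthly_inference_cost = (
    weekly_active_users * tokens_per_user_per_month * cost_per_million_tokens / 1_000_000
)
monthly_ad_revenue = weekly_active_users * ad_revenue_per_user_per_month

print(f"inference: ~${monthly_inference_cost / 1e6:,.0f}M / month")
print(f"ads:       ~${monthly_ad_revenue / 1e6:,.0f}M / month")
```

Under assumptions like these, serving costs are coverable by ads; what the sketch leaves out is training and the capex discussed next.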
Here's the other thing that mixes things up. Right now, OpenAI is not just trying to be 'a profitable company'. They're not just trying to stay where they are and build a regular business off it. They are trying to build and serve 'AGI', or as they define it, 'highly autonomous systems that outperform humans at most economically valuable work'. They believe that building and serving this machine to hundreds of millions would require costs order(s) of magnitude greater.
In service of that purpose is where all the 'insane' levels of money are moving. They don't need hundreds of billions of dollars in data centers to stay afloat or be profitable.
If they manage to build this machine, then those costs don't matter, and if things are not working out midway, they can just drop the quest. They will still have an insanely useful product that is already used by hundreds of millions every week, as well as the margins and unit economics to actually make money off of it.
The problem is they have real competition now and that market now looks like an expensive race to an undifferentiated bottom.
If someone truly invents AGI and it’s not easily copied by others then I agree it’s a whole new ballgame.
The reality is that, years into this, we seem to be hitting a limit to what LLMs can do, with only marginal improvements with each release. On that path, this gets ugly fast.
We should factor in that messaging that's seamless and undisclosed in conversational LLM output will be a lot more valuable than what we think of as advertising today.
People have figured it out by now. Generative "AI" will fail; other forms may continue, though it would be interesting to hear from experts in other fields how much fraud there is. There are tons of materials-science "AI" startups, and it is hard to believe they all deliver.
Context: I have been writing software for 30 years. I taught myself assembly language and hacked games/apps as a kid, and have been a professional developer for 20 years. I’m not a noob.
I’m currently building a real-time research and alerting side project using a little army of assistant AI developers. Given a choice, I would never go back to how I developed software before this. That isn’t my mind poisoned by hype and marketing.
However, there is a real risk that AI stocks will crash and pull the entire market down, just like it happened in 2000 with the dotcom bubble. But did we see an internet or dotcom winter after 2000? No, everybody kept using the Internet, Windows, Amazon, Ebay, Facebook and all the other "useless crap". Only the stock market froze over for a few years and previously overhyped companies had a hard time, but given the exaggeration before 2000 this was not really a surprise.
What will happen is that the hype train will stop or slow down, and you will no longer get thousands, millions, billions, or trillions in funding just because you slap "AI" onto your otherwise worthless project. If you are currently working on such a project, enjoy your time while it lasts. And rest assured that it will not last forever.
But that's me being a sucker. Because in reality this is just a clickbait headline for an article basically saying that the tech won't fully get us to AGI and that the bubble will likely pop and only a few players will remain. Which I completely agree with. It's really not that profound.
I think I will keep using it while it's cheap, but once I have to pay the real costs of training/running a flagship model I think I will quit. It's too expensive as it is for what it does.
The reason is that hype deflation and technical stagnation don't have to arrive together. Once people stop promising AGI by Christmas and clamp down on infinite growth + infinite GPU spend, things will start to look more normal.
At this point, it feels more like the financing story was the shaky part, not the tech or the workflows. LLMs have changed workflows in a way that's very hard to unwind now.
For example, fictional stories. If you want to be entertained and it doesn’t matter whether it’s true or not, there’s no downside to “hallucinations”. You could argue that stories ARE hallucinations.
Another example is advertisements. What matters is how people perceive them, not what’s actually true.
Or content for a political campaign.
The more I think about it, genAI really is a perfect match for social media companies.
Winters are when technology falls out of the vice grip of Capital and into the hands of the everyman.
Winters are when you’ll see folks abandon this AIaaS model for every conceivable use case, and start shifting processing power back to the end user.
Winters ensure only the strongest survive into the next Spring. They’re consequences for hubris (“LLMs will replace all the jobs”) that give space for new things to emerge.
So, yeah, I’m looking forward to another AI winter, because that’s when we finally see what does and does not work. My personal guess is that agents and programming-assistants will be more tightly integrated into some local IDEs instead of pricey software subscriptions, foundational models won’t be trained nearly as often, and some accessibility interfaces will see improvement from the language processing capabilities of LLMs (real-time translation, as an example, or speech-to-action).
That, I’m looking forward to. AI in the hands of the common man, not locked behind subscription paywalls, advertising slop, or VC Capital.
Yes, there is hype.
But if you actually filter it out, instead of (over)reacting to it in either direction, progress has been phenomenal, and the fact that there is visible progress in many areas, including LLMs, on the order of months demonstrates there are no walls.
Visible progress doesn’t mean astounding progress. But any tech that is improving year to year is moving at a good speed.
Huge apparent leaps in recent years seem to have spoiled some people. Or perhaps desensitized them. Or perhaps, created frustration that big leaps don’t happen every week.
I can’t fathom anyone not using models for 1000 things. But we all operate differently, and have different kinds of lives, work and problems. So I take claims that individuals are not getting much from models at face value.
But the fact that some people are not finding value isn’t an argument that the value, and increasing value, the rest of us are getting isn’t real.
What we should underscore though, is that even if there is a new AI winter, the world isn’t going back to what it was before AI. This is it, forever.
Generations ahead will gaslight themselves into thinking this AI world is better, because who wants to grow up knowing they live in a shitty era full of slop? Don’t believe it.
I think we'll continue to see anything be automated that can be automated in a way that reduces head count. So you have the dumb AI as a first line of defense and lay off half the customer service you had before.
In the meantime, fewer and fewer jobs (especially entry level), a rising poor class as the middle class is eliminated and a greater wealth gap than ever before. The markets are going to also collapse from this AI bubble. It's just a matter of when.
It could very well be that the current generation of AI has poisoned the well for any future endeavors of creating AI. You can't trivially filter out the AI slop, and humans are less likely to make their handcrafted content freely available for training. In fact, training models on GPL code in violation of its license might be ruled illegal, alongside generally stricter rules on which data you are allowed to use for training.
We might have reached a local optimum that is very difficult to escape from. There might be a long, long AI winter ahead of us, for better or worse.
> the world isn’t going back to what it was before AI. This is it, forever.
I feel this so much. I thought my longing for the pre-smartphone days was bad, but damn, we have lost so much.