TFA kind of did, with the '20-30 dollars per person, per month, across an organisation' quote / comment - though they didn't do the math for you.
But at that range of monthly spend it only takes roughly 280-420 people to reach the headline figure of $100k a year.
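A quick back-of-envelope check of that seat math (assuming the headline $100k is an annual figure and the plan price sits in the quoted $20-30/seat/month range; both are assumptions, not stated in TFA):

    # Rough seat math: how many seats at $20-30/person/month add up to ~$100k/year?
    annual_target = 100_000

    for per_seat_monthly in (20, 30):
        seats = annual_target / (per_seat_monthly * 12)
        print(f"${per_seat_monthly}/seat/month -> ~{seats:.0f} seats to hit ${annual_target:,}/year")

    # $20/seat/month -> ~417 seats
    # $30/seat/month -> ~278 seats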
Whether that's good value, who can say.
The math on the 'if costs keep rising' bit of the story would take a hefty amount of (the bad type of) oversight to get to that figure per developer, yes.
Unfortunately, people are swallowing the headline without any critical thinking.
At least they admit they have no idea what they are talking about.
The token cost stopped decaying as expected, as mentioned by the original $100k post on HN, and the move nowadays is towards more context to keep building functionality. The cost of inference is just going to go up. These companies might be better off splitting their focus between training and tooling, and canning all the capex/opex associated with inference.
Forget S/W engineers for a moment ... Every white collar worker I know, especially non technical folks, use ChatGPT all the time, and believe that is AI at this point. That demand isn't going to vanish overnight.
The counter argument is usually "They'll sell data", but I'm not sure you can double the number of trillion dollar data companies without some dilution of the market, and reach a billion devices / users without nation-state level infra.
Models get more computationally expensive as they start doing more things, and an equilibrium will be found around what people are willing to pay per token.
I do expect the quality of output to increase incrementally, not exponentially, as models start using more compute. The real problem begins when companies like Nvidia can't make serious optimizations anymore, but history suggests that's unlikely.
Then reality set in. Costs were raised so they weren't losing money anymore. Rideshare became more of a commodity and competitors got squeezed out as there wasn't much room to compete and make money. Service quality went downhill. Uber is generally reliable but the quality has fallen off a cliff. My last car smelled bad and the rear axle sounded like it was about to fall off. In most cities, at the airport I just walk outside and get an old-fashioned taxi at the rank rather than dealing with all the nonsense regulations forcing one to walk to some remote corner of a parking garage for the "ride share pickup" zone.
GenAI is entering that pivot point. The products have plateaued. There's pressure to stop the loss leaders and set prices at a more realistic level. Services are becoming commoditized. It's not going away, but we're entering a period of rapid consolidation. GenAI will still be here in a few years and will be useful, but like rideshare the allure will wear off and we'll look at these things like we do spell checkers today: something everyone uses, but ultimately boring commoditized tech where there's not a lot of money to be made. A useful feature to add to actual products.
I do think there’s some good opportunity to shift to locally run small models, but that too will just become commoditized spell-checker level tech.
This is not an analogous situation.
Inference APIs aren’t subsidised, and I’m not sure the monthly plans are any more either. AI startups burn a huge amount of money on providing free service to drive growth. That’s something they can reduce at any time without raising costs for their customers at all. Not to mention the fact that the cost of providing inference is plummeting by several orders of magnitude.
Uber weren't providing free service to huge numbers of people, so when they wanted to turn a profit they couldn't cut there and had to raise prices for their customers. And the fees they paid to drivers didn't drop a thousandfold, so it wasn't getting vastly cheaper to provide the service.
It's about how many of the users hit those limits.
You can pay Amazon or a great many other hosting providers for inference for a wide variety of models. Do you think all of these hosting providers are burning money for you, when it’s not even their model and they have no lock-in?
> It would be very impressive if Claude was making money off of their $20/month users that hit their weekly limits.
They have been adjusting their limits frequently, and the whole point of those limits is to control the cost of servicing those users.
Also:
> Unit economics of LLM APIs
> As of June 2024, OpenAI's API was very likely profitable, with surprisingly high margins. Our median estimate for gross margin (not including model training costs or employee salaries) was 75%.
> Once all traffic switches over to the new August GPT-4o model and pricing, OpenAI plausibly still will have a healthy profit margin. Our median estimate for the profit margin is 55%.
— https://www.lesswrong.com/posts/SJESBW9ezhT663Sjd/unit-econo...
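For illustration only, a minimal sketch of how a gross-margin estimate like that is put together. The price, GPU cost, and throughput numbers below are hypothetical placeholders, not figures from the post:

    # Toy gross-margin estimate for an inference API.
    # All numbers are illustrative placeholders, not actual OpenAI figures.
    price_per_1m_tokens = 10.00      # hypothetical revenue per 1M tokens served ($)
    gpu_cost_per_hour = 2.50         # hypothetical cost of one GPU-hour ($)
    tokens_per_gpu_hour = 1_000_000  # hypothetical serving throughput

    revenue_per_gpu_hour = price_per_1m_tokens * tokens_per_gpu_hour / 1_000_000
    gross_margin = (revenue_per_gpu_hour - gpu_cost_per_hour) / revenue_per_gpu_hour
    print(f"gross margin ~ {gross_margin:.0%}")  # ~75% with these made-up numbers
    # Like the quoted estimate, this excludes training costs and salaries.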
And more discussion on Hacker News here:
"likely profitable", "median estimate"... that 75% gross margin is not based on hard numbers.
This made me laugh. Thanks for making my Friday a little bit better.
The fixed cost due to R&D is what makes it unprofitable, not each request. Your line of thinking is a bit ridiculous because OpenAI is never going to lose money per request.
If your kid makes $50 with a lemonade stand, she thinks she made $50, because she doesn't account for the cost of the lemonade, table, lawn, etc. You're subsidizing your child.
Not until the cost of the previous training has been completely amortised.
Even if some company did immediately stop all training, they would only show a profit until the next SOTA model is released by a competitor, and then they would go out of business.
None of them have any moat, other than large amounts of venture capital. Even if there is a single winner at the end of all of this, all it would take is a much smaller amount of capital to catch up.
We don't know this for sure. I agree that it would be insane from a business perspective, but I've seen so many SV startups make insane business decisions that I tend to lean towards this being quite possible.
If the amortisation period is too short (what is it now? 8 months? 6 months?) that "profit" from each inference token has to cover the training costs before the amortisation schedule ends.
In short, if you're making a profit of $1 on each unit sold, but require a capex of $10 in order to provide the unit sold, you need to sell at least 10 of those units to break even.
The training is the capex, the inference profit is the profit/unit sold. When a SOTA model lasts only 8 months, the inference has to make all that back in 8 months in order to be considered profitable.
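A minimal sketch of that break-even arithmetic, with made-up numbers standing in for real training and serving costs:

    # Break-even check: inference margin has to repay the training capex
    # before a competitor's SOTA model makes this one obsolete.
    # All figures are made up for illustration.
    training_capex = 10.0   # capex spent on training ($)
    margin_per_unit = 1.0   # profit per unit of inference sold, after serving costs ($)
    lifespan_months = 8     # months until the model is superseded

    units_to_break_even = training_capex / margin_per_unit
    units_per_month_needed = units_to_break_even / lifespan_months
    print(f"need {units_to_break_even:.0f} units in total, "
          f"i.e. {units_per_month_needed:.2f} units/month over {lifespan_months} months")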
When you factor in the R&D costs required to make these models and the very limited lifespan of a model (and thus extremely high capital investment depreciation rate) the numbers are pretty nasty.
If the trend of staggering AI performance gains stops, you can afford to cut down on R&D and remain competitive. If it doesn't, you hit AGI and break the world economy - with a hope that it'll break in your favor.
They get to convert that compute to inference compute, pivot their R&D towards "figure out how to make inference cheaper" and leverage all the economies of scale.
Sure, if all you ever look at are the token costs, the inferencing costs at the edge, then the narrative that this will never skyrocket in price and the gates to the walled garden won’t ever close seems believable. Once you factor in the R&D, the training, the data acquisition, the datacenters, the electricity and water and real estate and lobbying and shareholder returns…
It’ll be the most expensive product your company pays for, per seat, by miles, once the subsidy period ends and the real bills come due. Even open-weight models are likely to evaporate or shift to some sort of Folding@Home type distributed training model to keep costs low.
I may be wrong, but wasn’t compute part of Microsoft’s 2019 or 2023 investment deals with OpenAI?
How do you get to that conclusion? There is no inference without training, so each sale of a single inference token has a cost that includes both the inference as well as the amortised cost of training.
Right, but that is why I used the word "amortise"; there is only a limited time to recuperate that cost. If you spend $120 on training, and it takes 6 months for the next SOTA to drop from a competitor, you have to clear $20/m after inference costs.
The way the big AI players are playing supports the assertion that the LLM is plateauing. The differentiator between OpenAI, Gemini, Copilot, Perplexity, Grok, etc. is the app and how they find novel ways to do stuff. The old GPT models that Microsoft uses are kneecapped and suck, but Copilot for Office 365 is pretty awesome because it can integrate with the Office graph and has a lot of context.
Of the pure-play companies, only OpenAI do this. Like, Anthropic are losing a bunch of money and the vast majority of their revenue comes from API usage.
So, either the training costs completely dominate the inference costs (seems unlikely but maybe) or they're just not great businesses.
I do think that OpenAI/Anthropic are probably hiring a lot of pre and post sales tech people to help customers use the products, and that's possibly something that they could cut in the future.
I’m not sure I understand you. You can use Claude for free just like you can use ChatGPT for free.
For basically an hour. Like, have you tried to do this? I have, and ended up subscribing pretty soon.
Additionally, if you look at Anthropic's revenue the vast, vast majority comes from API (along with most of their users). This is not the case for OpenAI, hence my point.
> Inference APIs aren’t subsidised
This is hard to pin down. There are plenty of bare-metal hosting companies providing inference at market rates (i.e. presumably profitably, if heading towards some commodity price floor). The premise that every single one of these companies is operating at a loss is unlikely. The open question is the "off-book" training cost of the models running on those servers: are your unit economics positive when factoring in training costs? And if those training costs are truly off-book, it's not a meritless argument to say the model providers are "subsidizing" the inference industry. But it's not a clear-cut argument either.
Anthropic and OpenAI are their own beasts. Are their unit economics negative? Depends on the time frame you're considering. In the mid-longer run, they're staking everything on "most decidedly not negative". But what are the rest of us paying on the day OpenAI posts 50% operating margins?
A lot of people disagreed with this point when I posted it, however Sam Altman said last week:
> We're profitable on inference. If we didn't pay for training, we'd be a very profitable company.
— https://www.axios.com/2025/08/15/sam-altman-gpt5-launch-chat...
The only thing which has gone downhill more is Airbnb.
At best a middling experience these days, and on average a poor experience.
I am using the current models and they are still as useful as 6 or 12 months ago
The deal is still about the same: if you bother to do most of the hard part (thinking it through) the code generators can just about generate all the boilerplate
Yeah. Amazing.
Not the primary point of your post, but I am always evangelizing to my friends about this 'hack.' I can't believe that people are willing to walk half a mile and queue up in the rain/sun/snow to be driven by some random person who will probably make them listen to their demo tape, instead of just taking the myriad taxis that are sitting right there.
Takes probably 20-30 minutes off of my airport commute.
It's kind of undeniable at this point that at least some parts of the AI boom have been really good for society. It just took a while to realize exactly where this was useful.
Not sure about the "allowing for more control" part. What happens when a complex/exotic issue arises that an AI is not able to solve? The bosses will have to pay a premium to whatever expert is left to address the problem.
What happens when a complex piece of machinery breaks down at the factory? You call in an expensive mechanic to fix it. You also still have engineers overseeing the day to day operations and processes to make sure that doesn't happen, but the bulk of the work is carried out by semi-skilled labor. It's not hard to imagine software going this way, as it's inevitably what capital wants.
Why are people always speaking about CRUD? In 30 years, I haven't done anything related to CRUD. And I'm also very confused about it.
Or at least cheaper.
https://www.gnu.org/philosophy/who-does-that-server-really-s...
This is not even considering the fact that the performance has also increased.
Right?
Whereas getting AI to any sort of approximation of what it's hyped up to be may involve exponentially higher hardware costs.
So for the longest period of time, AI was sitting at about 90% accuracy. With the use of Nvidia hardware it's gotten to, say, 99 to 99.9%. I don't think it's actually 99.9%.
To replace humans, I think you effectively need 99.999%, and even more depending on the domain; self-driving is probably eight nines.
What's the hardware cost to get each one of those nines? Linear? Polynomial? Exponential?
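To make the "nines" concrete, here's a tiny sketch of what each additional nine means in permitted errors; how the hardware cost scales per nine is exactly the open question above:

    # What each "nine" of accuracy allows in terms of errors.
    for nines in range(2, 9):   # 99% ("two nines") up to 99.999999% ("eight nines")
        accuracy = 1 - 10 ** -nines
        errors_per_million = 10 ** -nines * 1_000_000
        print(f"{nines} nines ({accuracy:.8%}) -> {errors_per_million:g} errors per million operations")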
It’s amazing to think that humans have been writing blunder-free code all this time and only AI is making mistakes.
Human coders, including good ones, make errors all the time, and I don't fully trust the code written by even my strongest team members (including myself; I'm far from the strongest programmer).
And we blame AI for slop and hallucination...
This is why we need the discipline of history. People's memories are nonexistent.
The world hasn't stood still during the past few years! During the pandemic, the tech companies went on a hiring binge for demand that didn't materialize. Then interest rates skyrocketed as the Fed buckled down to fight inflation. Then Elon Musk bought Twitter and laid off 80% of the staff without the service collapsing, showing twitchy tech executives that their companies could get by with less. There were also massive tax code changes, only very recently reversed, that further encouraged less headcount.
We've had so many headwinds against tech employment completely unrelated to AI. Big companies have been responding, but they don't move fast, so they likely still haven't done everything they want to do.
AI is just a convenient excuse, and it's amazing how well it's worked on so many people!
Checking the link, it seems this mistake occurred when generating an image of the names of all the presidents, I believe using gpt-image-1 (the model which GPT-5 will call, but which predates GPT-5). Not to say inaccuracies in generated image text shouldn't also be addressed, but I feel it's relevant context when judging progress.
> The verdict of some users is in. They hate GPT-5. [...] OpenAI to bring back the older, but more reliable, GPT4o model
GPT-5 is top of https://livebench.ai/#/ and top of https://lmarena.ai/leaderboard (blind user-preference ratings). Those leaderboards aren't the be-all-and-end-all, but I feel narratives formed within particular groups can be prone to confirmation bias (with different groups having different narratives and seeing the others as delusional).
> and an Arizona State University study indicate that our current ways of improving LLMs have gone as far as they can go
Unclear to me that a study of a 4-layer GPT-2 on a synthetic task implies specifically now is the stopping point. I don't believe in a singularity/intelligence explosion or that we're close to replacing all human labor, but it does seem that incremental progress has continued through over a decade of people saying deep learning is hitting a wall.
> Kilo Code blog observed, people have been assuming that since "the raw inference costs were coming down fast, the applications' inference costs would come down fast as well but this assumption was wrong."
> [...] Of course, Sam Altman, OpenAI's CEO, can predict "The cost to use a given level of AI falls about 10x every 12 months," but I believe that just as much as I would an Old West snake oil huckster who'd guaranteed me his miracle elixir would grow my hair back.
Wouldn't necessarily trust Altman's predictions in general, but the fact that the cost of a fixed level of capability decreases dramatically over time seems fairly uncontroversial and easy to verify across providers.
That's not inconsistent with the fact that the per-token price of the current best model has stayed approximately the same, which is what the quoted blog post is referring to. I believe that's mostly just determined by how much people are willing to pay (if people are willing to pay more to be a couple months ahead of the current progress curve, scale up the model or increase context/CoT to squeeze out extra performance).
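Taking Altman's claim at face value, a small sketch of what a "10x cheaper every 12 months" curve implies for a fixed level of capability (the starting price is an arbitrary placeholder). The frontier model's per-token price can stay flat while this holds, because each year a more capable model occupies that price point:

    # If a *fixed* capability level gets ~10x cheaper every 12 months:
    #   cost(t) = cost_0 * 0.1 ** (t / 12)
    cost_0 = 10.0  # arbitrary placeholder: $ per 1M tokens for today's best model
    for months in (0, 12, 24, 36):
        cost = cost_0 * 0.1 ** (months / 12)
        print(f"after {months:>2} months: ${cost:.3f} per 1M tokens for that same capability")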
> Other companies, like Microsoft and Google, are sneaking AI charges into your existing software-as-a-service (SaaS) bills
They really are. My Google Workspace bill was increased by $5/month purely because of the "value add" that AI provides.
I turned Gemini off in my tenant (like, chatted with customer support to REALLY turn it off). I have no desire to turn it back on. There is no way to remove the $5/month increase.
At this point, they are literally just stealing $60/year from me (and others) to fund their AI bullshit.
As soon as I find an alternative to Drive and Docs, I am gone. I find this unacceptable.
1. AI adoption has outpaced expectations relative to Moore's law. Silicon Valley startups typically price their services based on anticipated future costs rather than current ones. While it was clear that AI inference costs would decline substantially, the rate of adoption was faster than projections, creating a gap where usage scaled faster than costs could decrease.
2. As companies like Cursor began recruiting talent from Anthropic, Anthropic responded by raising prices, recognizing that they were now competing directly with services built by their former employees.
austin-cheney•5mo ago
taylodl•5mo ago
AI? In the hands of a craftsman AI is just another tool that can help boost productivity. In the hands of a junior developer trying to use AI to make it appear that they're a senior developer - no. There are too many downside risks in that scenario because the junior developer doesn't have enough experience to know when the tool is providing bad results.
austin-cheney•5mo ago
Frameworks solve two business problems:
1. Candidate selection
2. Clearer division of labor
That's it. Everything else is an imaginary quality from the developer. In most cases the things developers argue large frameworks provide can be achieved faster without the frameworks, both in human writing speed and software execution speed. Typically these imaginary qualities are defended in a defensive tone by developers who have never learned to execute without the frameworks, which becomes an argument from ignorance: you know what you are arguing for but not what you are arguing against.
At any rate the result is the same that the article makes about AI: output of brittle toolchains.
orsorna•5mo ago
I really can't put frameworks in the same bucket as AI. At least frameworks describe an abstract model for a developer to rationalize and think through. AI allows (but doesn't mandate) a developer to write code that they don't understand.
Perhaps I've worked on business logic so long instead of esoteric efforts; what real world use case would benefit from not leveraging a framework where applicable?
In fact, I see your publicly posted resume; are there really developers out there rawdogging Javascript? What problem space do you hire for that mandates the ignorance of >15 years of JS libraries?
And does your business pay above market rate for these skilled developers? Without understanding the problem space, I just assume your business tries to hire talented people at exploitative wages. Regardless, it appears to be a staggering waste of talent unless the higher quality significantly reduces the cost of downtime, bugs, etc. (I find this hard to believe).
austin-cheney•5mo ago
Performance, security, and portability to start. If you work in a high security environment you should expect to NOT have access to things like Maven or NPM.
I hear so very many people on HN and the real world complain about web bloat. Even the people who contribute to that bloat and cannot live without the contributing factors complain about it. As somebody who is only writing personal software now and working in a completely unrelated field I certainly wouldn't punish myself with bloat that requires far more work than executing without it.
taylodl•5mo ago
There are a myriad of other benefits from using frameworks, so many in fact that to me it's a red flag when someone advocates against them.
With regards to AI, like I said in my original comment it's a tool, not a panacea. Experienced software tradesmen are learning how to effectively wield AI - hint: it's not for writing all your code. I'm kind of excited because we could be on the verge of another great productivity boost like we experienced 30 years ago when we adopted frameworks.
austin-cheney•5mo ago
That is the primary reason I wanted to change employment from writing JavaScript to something else. If I could find JavaScript employment without this insanity then maybe I will reconsider my options. I see AI as more of the same.
brianmcc•5mo ago
Frameworks may not bring magical perfection but they bring a lot of objective benefit to the table.
AtlasBarfed•5mo ago
Contrasted with the specific acronyms, frameworks, languages, and similar buzzwords that dominate computing recruitment.
If you have a tool that can basically take someone who's really good at programming, architecture, analysis, etc., and eliminate the barriers of domain-specific knowledge, syntax idiosyncrasies, and library peculiarities...
It should mean a talented IT professional should actually be more useful across more domains. And hiring should reflect that. It's hard to tell right now because hiring is zero apparently.
For example, I haven't coded in Rust. I haven't coded in C++ in 20 years. Assuming of course that I am a genius, which of course I am, right? Assuming that, an LLM should allow me to both code in a language I don't have a lot of experience in and adapt to your enterprise's particular code base far more quickly.
Does my value go up? Probably not, because now I compete with everyone else who's pretty smart, without domain barriers. That's a large increase in supply.
However, large numbers of IT people who don't even know the basics of computer science, architecture, or those types of things will not have any real value compared to an LLM.
With the hypnosis of the executives, they do not see a difference between those two different types of professional. They see an IT budget that they want to axe.
One of the fundamental tensions in the IT management/labor dispute is that someone who manages and maintains a service and codes... is actually a manager. Good IT professionals are providing both technical service and managerial service to a company.
Consider what computers used to do. There used to be a room full of actual human computers on calculators doing things, and of course, a manager that oversaw them.
That manager was clearly considered part of the managerial class.
Then came computers and the room of people disappeared, but you still had a manager managing an IT application. Without the head count, management decided that that person wasn't a manager or a member of the managerial class.
Yet they still had to pay him like one.
See, I think ultimately LLMs are taking away some of the technical overhead of managing an enterprise system for a company. But it still needs to be managed. And that person will still be "IT".
And if you don't pay that person reasonably well, your enterprise will suffer.
austin-cheney•5mo ago
Yes, there are some fundamental skill deficits and a lot of liars out there. The historical solution to this problem is to ignore it, use some tool to flatten the bell curve, and then hope for the best. If AI is just an evolution of things already tried, then we already know what the results will look like: less accountability, less credibility, lower selection risk in candidate selection, and more expensive development processes. For example, frameworks allowed substantially wider developer participation at lower product quality and higher cost, without change to business ambitions. We should expect developers to become even more of a less-capable commodity than they already were.
Then, when the technical debt becomes too expensive to maintain, just call in consultants to tell everybody what is already commonly known.
AtlasBarfed•5mo ago
But you're right, it's not going to happen