In legal tech, we run domain-specific models for contract review that use 90% fewer tokens than general-purpose LLMs because they understand legal document structure natively. The token cost per document dropped from dollars to cents.
The real "tokenpocalypse" is for use cases that try to do everything with one general model. As the ecosystem matures toward specialized tools (similar to how we got specialized IDEs for different programming languages), token efficiency improves dramatically.
The analogy holds: general-purpose models are like Swiss Army knives — useful but inefficient. Domain-specific models are like proper tools — more expensive upfront but vastly more efficient for their domain.
Assuming the intelligence of a model continuously improves with scale, the token price of the best model will keep rising.
I know that tokens are currently experiencing rapid price drops, but they will eventually encounter physical limitations.
The corporate side seems to be well... stupid? Execs asking their people to burn tokens do not understand the politics and cadence of business. Corporations do not actually demand more work to be completed in the way we traditionally think. Creating a lot of stuff in a corporation tends to naturally banish most of it to the void because that stuff requires other people to exist and engage with it in order to use it, deploy it, get customers using it, etc. AI does not take up that slack in the way that we are being told because it lacks agency. For most people in corporations the problem is not that they can't do their work, their real jobs are mostly being political nodes in a vast system. There is no solution on the table to change that at all.
- The frontier AI companies have realized they won't be able to count on gaining ground and earning more in the future through sheer moat. They have to start earning right now.
- The playing field on the market got a whole lot more even as a result. Now everyone is competing on cost and quality, while there is still a lot of competition. AI companies can't easily get away with subsidizing their own product and enshittifying it later.
I might be missing something obvious here? It feels to me that if the frontier AI companies thought they could gain a lot more moat, they wouldn't raise their prices this much this early? And their current moats/head start don't seem insurmountable?
They have to do it in reverse order which seems to be maybe impossible. I contend that SOTA models are still quite bad at what their companies claim them to be good at. They remain confidently wrong more often than they should be. The public also is tired of 'slop' and will continue to push back on it.
I don't think you're missing anything, but I am surprised that the forces behind the AI companies did. They do need to start making money, but I don't think anyone has a plan as to how they are going to do this. As for enshittification, that was always on the table for the free tier. It was also always going to be the drug-dealer strategy, where the first hit is free.
The cost of AI is still too high: datacenters aren't being completed, the hardware is too expensive, electricity is too expensive, and the technology is good but requires hand-holding. We're going to see AI being deployed more sparingly and in a more targeted way, so that the cost is justified.
Of course the question remains, who is supposed to be buying products through this system if AI systems continue to displace jobs?
When the interaction is exploratory, the marginal cost feels invisible: ask again, summarize again, try another agent. In a business workflow, the same pattern becomes a metering problem. You have to decide which parts actually need a frontier model, which can use a smaller/local model, and which should not be generated at all.
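To make the metering decision concrete, here is a minimal sketch of that triage. Everything in it is an assumption for illustration: the tier names, the prices per million tokens, and the error-cost thresholds are hypothetical placeholders, not real vendor pricing or anyone's actual policy.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens (placeholders, not real pricing).
PRICE_PER_MTOK = {"none": 0.0, "local": 0.0, "small": 0.5, "frontier": 15.0}


@dataclass
class Task:
    name: str
    needs_generation: bool  # False when a lookup or plain script does the job
    error_cost_usd: float   # rough cost to the business of a wrong answer
    est_tokens: int         # estimated tokens consumed per run


def pick_tier(task: Task) -> str:
    """Pick the cheapest tier that is acceptable for the task.

    The thresholds are made-up cutoffs; tune them to your own workflow.
    """
    if not task.needs_generation:
        return "none"      # should not be generated at all
    if task.error_cost_usd >= 1000:
        return "frontier"  # mistakes are expensive, pay for the best model
    if task.error_cost_usd >= 10:
        return "small"
    return "local"


def estimated_cost(task: Task) -> float:
    """Estimated spend in USD for one run of the task."""
    return task.est_tokens / 1_000_000 * PRICE_PER_MTOK[pick_tier(task)]
```

The point of the sketch is only that the routing rule is explicit and cheap to audit, so the "which model does this step need" question gets answered once per workflow instead of once per prompt.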
That probably pushes AI products away from "chat with everything" and toward much narrower tools with explicit ROI: less open-ended generation, more constrained pipelines, caching, evaluation, and human review at the points where mistakes are expensive.
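Of the items in that list, caching is the easiest to show. A rough sketch, assuming the model call is just a function from prompt string to response string (the `generate` callable here is a hypothetical stand-in, not any particular vendor's API):

```python
import hashlib
from typing import Callable, Dict


class CachedGenerator:
    """Wrap a text-generation callable so repeated prompts are not paid for twice."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate          # hypothetical model call
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share one entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def __call__(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self._generate(prompt)    # only cache misses cost tokens
        self._cache[key] = result
        return result
```

An exact-match cache like this only helps when the same prompt really does recur, which is more likely in a constrained pipeline than in open-ended chat. The hit and miss counters are there so the saving can be measured rather than assumed.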
It depends on where you buy the tokens from. The Jevons paradox exists in China and not in the US, for now.
> In just a few months, companies became obsessed with “tokenmaxxxing,” then turned against it due to the high costs.
Casinos (in the US) tell customers to spend more on tokens, and introduce free spins, discounts, and limit resets at peak hours. Then they introduce a new slot machine that promises to give gamblers better odds, but is instead more expensive to use.
The ones in China did the opposite and made their discount on tokens permanent.
All this 'tokenmaxxing' was an outright scam. Now the AI companies want you 'tokenmaxxing' your agents on loops as the token prices increase.
Pretty sure from inception the phrase “tokenmaxxing” was never seen in a positive light…
Here are my concrete predictions
1. Token costs will come down and performance will go up
2. Everyone will spend even more on LLMs not less - the article points at small blips but if anyone thinks it will go down from now, you are mistaken
3. AI Companies will be profitable
If anyone wants to counter bet on me, please go ahead.
but many of the current crop will never return money to investors.
I largely agree with you, but the huge investments currently being made will be very hard to get a return on. Token costs will come down, performance will go up, and you want to be in the business of selling the picks & shovels, not doing the mining.
Which is of course why Nvidia, Google & TSMC are in pretty good positions, but even their valuations have some bubble in them.
I mean this is a sort of conspiracy theory and I genuinely don't know why people think AI is particularly hard to get money back from?
> I largely agree with you, but the huge investments currently being made will be very hard to get a return on.
Why do you find it huge? Anthropic went from $1B to $44B revenue in a few months and this is unprecedented.
1. The margins on inference are huge
2. There is a genuine moat because AI models have personalities, strengths and weaknesses, so they are definitely not fungible
I think a lot of handwaving goes on but it comes in the form of some latent concern that AI might just be profitable. But the reality is that it will be.
None of the "selling picks and shovels" analogies will stick.
as Jensen said, get ready for $1000 per million tokens
those for whom this price makes sense will push out those for whom it doesn't - to lower-tier models or to local models
but those who want to run local models need to compete for hardware with the data centers, which have strong scale effects and thus will always be able to outbid buyers of local hardware - this can already be seen now as hardware makers get out of the retail business
Hoping your customer base is so old they forget to cancel the subscription might not work so well this time. “Popcorn eating ensues”
I have the feeling that the age of "I can't be blamed for AI stuff" will be a "this was the computer guy's mistake" for a moment.
PS. I've been using Claude Opus 4.8 and it is worse than 4.6, and I will say that even Sonnet 4.6 is better. PhD-level software engineering, I believe! I know many PhDs who never coded or worked anyway
Anybody doing things seriously understands how to optimize their workflows for smaller models once they start to lock in processes.
There are so many useless cases such as people bragging about their token consumption that has no product and no value add, or those with OpenClaw doing useless automation that could be a Python script.
If you're following a bunch of people who are from LLM labs, you're going to be more incentivised to tokenmaxx because it's in the labs' best interest to get you to behave that way.
Practically, many companies aren't labs with endless runway. Companies hopefully follow a PnL model. And when you look at things through that lens, the LLM use case often falls apart.
You're seeing a bunch of companies starting to realise that tokenmaxxing yields very little ROI.
Even at the LLM labs, the guy who spent $1M+ on tokens has nothing to show for it in terms of revenue to the company. And you have to keep sinking that much into AI for ... "features".
There are some good use cases for AI. I ended up with a positive ROI on a greenfield project myself, albeit on a small scale.
The way that AI has been making people take totally irrational decisions from executive, pure business and technical standpoints is simply mindblowing. I don't understand how people can't take a step back and see what's actually happening from a macro perspective.
Eh... this is HN. The goal is precisely to reach BS escape velocity and SpaceX is the model to follow. It's not healthy IMHO (I'm not an economist) but that's definitely the arms race VCs actually fund. Lose for years if not decades, achieve market dominance, squeeze. Very very few winners and for those the path is precisely NOT to follow PnL.
This too shall pass. After enough bullshit people will become fed up and enforcement of existing laws will start breaking up the most egregious items. New laws will pass. People will make and lose fortunes, and we will live on.
Anecdotal experience - my coworkers will use the "max-think" and the most expensive model on every change they do with Claude, pumping out 100k's of tokens just because they can (and brag about hitting the limits).
I suspect this kind of behaviour will need to change in the very near future.
[0] - https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline...
Using GitHub changing Copilot's business model as some indication of a "Tokenpocalypse" is not a good reference. GitHub had a terrible, request-based pricing model that everybody who used it abused.
kimi-k2.6 can do a pretty damn good job with vision for optimizing ui design workloads in a loop. not cheap but significantly cheaper than anthropic.
mimo 3 is just pretty damn good when you need a high end reasoning model - also relatively affordable.
I was able to run gemma and do some coding locally on a 32 GB machine. it was slow as molasses but the fact that it worked at all on a local machine that wasn't designed around AI workloads is great.
It's only a tokenpocalypse if you rely on these closed and frankly overpriced american models. is opus better than kimi-k2.6? arguably yes but not 16 times better from what I've been seeing.
Doesn't this just mean a price increase? What is not clear is how much the price needs to increase for AI companies to break even at some point. A 3x increase? A 10x increase? Even more? No one seems willing to give a clear number.
and we are fast approaching limits which will be hard to overcome - electricity, chips
The push by companies to incorporate AI into everything is (depending on the company) either hype and cargo-culting, or it was an attempt by management to (1) try to discover whether and which new workflows or tools could use it and (2) force the haters to use it as it got better.
Where I work, there is an obvious split between people who have been willing to use AI, and those that hated it from day 1 and mocked the "stochastic parrots". Senior folks were disproportionately haters, and generally didn't see much productivity lift from early AI stuff. They strongly resisted the mandates to use AI, and completely missed the "agentic" inflection point that other colleagues experienced. The more willing users saw Claude Code/agents and were able to experience this as the genuine benefit it can be. Now that the more senior folks are using agentic programming, they're genuinely able to maintain code quality and see meaningful speed improvements in coding tasks.
Today, tokenmaxxing doesn't make sense because we found the product-market fit of agentic coding. Now that most (?) employees are on board with using it, the industry can shift focus to cost-effective usage and positive-ROI usage. For example, Uber shifting to a fixed per-employee token budget.
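For what a fixed per-employee budget could look like mechanically, here is a minimal sketch. It is purely illustrative: the class, the limit, and the reset behaviour are hypothetical and say nothing about how Uber or anyone else actually implements this.

```python
from typing import Dict


class TokenBudget:
    """Track token usage per employee against one fixed limit per period."""

    def __init__(self, limit_per_period: int) -> None:
        if limit_per_period < 0:
            raise ValueError("limit must be non-negative")
        self.limit_per_period = limit_per_period
        self._used: Dict[str, int] = {}

    def remaining(self, employee: str) -> int:
        """Tokens the employee can still spend in the current period."""
        return self.limit_per_period - self._used.get(employee, 0)

    def charge(self, employee: str, tokens: int) -> bool:
        """Record usage and return True, or return False if it would exceed the budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining(employee):
            return False   # caller should fall back to a cheaper model or stop
        self._used[employee] = self._used.get(employee, 0) + tokens
        return True

    def reset(self) -> None:
        """Start a new budget period for everyone."""
        self._used.clear()
```

A hard cap like this is the bluntest possible design. A softer variant would downgrade to a cheaper model once the budget runs low instead of refusing outright.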
AI could be absolutely perfect and we'd still struggle to deploy it in a value generating way simply because it will exceed our ability to adapt.
So tokenmaxxing might be the wrong thing to do, but only because it's focussing on the wrong problem rather than because it doesn't actually work.
xvxvx•1h ago
I knew right there and then that he was a moron. There’s something about American companies where the best and brightest rarely show up in senior management. It seems to be populated by some weird class of golf playing NPCs that figured out how to game the system and bring all their cult members along for the ride.
My own company spent 2+ years enforcing extreme austerity, to the point of firing the very people who built everything, only to run wild with AI spending and see few results from it.
Surely, out there in the wilderness, there is a company staffed by intelligent, skilled people. Right?
vrganj•34m ago
Musk. Zuck. Bezos.
All three are buddying up with government officials, all three routinely embarrass themselves when they try to talk shop.
Only difference is they're much more socially awkward and less superficially charming than the stereotype would suggest.