To me, it is hard to reject this hypothesis today. The fact that Anthropic is rapidly trying to increase price may betray the fact that their recent lead is at the cost of dramatically higher operating costs. Their gross margins in this past quarter will be an important data point on this.
I think the tendency for graphs of model assessment to display the log of cost/tokens on the x axis (i.e. Artificial Analysis' site) has obscured this dynamic.
So there's a push for them to increase revenue per user, which brings us closer to the real cost of running these models.
At that point you are beholden to your shareholders and no longer can eschew profit in favor of ethics.
Unfortunately, I think this is the beginning of the end of Anthropic and Modei being a company and CEO you could actually get behind and believe that they were trying to do "the right thing".
It will become an increasingly more cutthroat competition between Anthropic and OpenAI (and perhaps Google eventually if they can close the gap between their frontier models and Claude/GPT) to win market share and revenue.
Perhaps Amodei will eventually leave Anthropic too and start yet another AI startup because of Anthropic's seemingly inevitable prioritization of profit over safety.
Or they are just not willing to burn obscene levels of capital like OpenAI.
Much of the token usage is in reasoning, exploring, and code generation rather than outputs to the user.
Does making Claude sound like a caveman actually move the needle on costs? I am not sure anymore whether people are serious about this.
To me, caveman sounds bad and is not as easy to understand compared to normal English.
I'm already at 27% of my weekly limit in ONE DAY.
it seems to hallucinate a bit more (anecdotal)
People love to throw around "this is the dumbest AI will ever be", but the corollary to that is "this is the most aligned the incentives between model providers and customers will ever be" because we're all just burning VC money for now.
Please say this louder for everyone to hear. We are still at the stage where it is best for Anthropic's product to be as consumer aligned (and cost-friendly) as possible. Anthropic is loosing a lot of money. Both of those things will not be true in the near future.
https://platform.claude.com/docs/en/about-claude/pricing
So if you are generating more tokens, you are eating up your usage faster
It is like comparing an 8K display to a 16K display because at normal viewing distance, the difference is imperceptible, but 16K comes at significant premium.
The same applies to intelligence. Sure, some users might register a meaningful bump, but if 99% can't tell the difference in their day-to-day work, does it matter?
A 20-30% cost increase needs to deliver a proportional leap in perceivable value.
For coding though, there is kind of no limit to the complexity of software. The more invariants and potential interactions the model can be aware of, the better presumably. It can handle larger codebases. Probably past the point where humans could work on said codebases unassisted (which brings other potential problems).
Gamblers (vibe-coders) at Anthropic's casino realising that their new slot machine upgrade (Claude Opus) is now taking 20%-30% more credits for every push of the spin button.
Problem is, it advertises how good it is (unverified benchmarks) and has a better random number generator but it still can be rigged (made dumber) by the vendor (Anthropic).
The house (Anthropic) always wins.
> People just want free tools forever?
Using local models are the answer to this if you want to use AI models free forever.
Re-ran the bake-off with 4.7 authoring and… gpt5.4 still clearly winning. Same skills, same prompts, same agents.md.
> In Claude Code, we’ve raised the default effort level to xhigh for all plans.
Try changing your effort level and see what results you get
uberman•1h ago
Given that Opus 4.6 and even Sonnet 4.6 are still valid options, for me the question is not "Does 4.7 cost more than claimed?" but "What capabilities does 4.7 give me that 4.6 did not?"
Yesterday 4.6 was a great option and it is too soon for me to tell if 4.7 is a meaningful lift. If it is, then I can evaluate if the increased cost is justified.
pier25•57m ago
ed_elliott_asc•52m ago
solenoid0937•50m ago
https://marginlab.ai/trackers/claude-code-historical-perform...
Majromax•36m ago
Moreover, on the companion codex graphs (https://marginlab.ai/trackers/codex-historical-performance/), you can see a few different GPT model releases marked yet none correspond to a visual break in the series. Either GPT 5.4-xhigh is no more powerful than GPT 5.2, or the benchmarking apparatus is not sensitive enough to detect such changes.
cbg0•35m ago
addisonj•29m ago
But... Are you really going to completely rely on benchmarks that have time and time again be shown to be gamed as the complete story?
My take: It is pretty clear that the capacity crunch is real and the changes they made to effort are in part to reduce that. It likely changed the experience for users.
grim_io•50m ago
nfredericks•43m ago
grim_io•11m ago
Jeremy1026•30m ago