

Anthropic tightens usage limits for Claude Code without telling users

https://techcrunch.com/2025/07/17/anthropic-tightens-usage-limits-for-claude-code-without-telling-users/
179•mfiguiere•3h ago

Comments

jablongo•3h ago
I'd like to hear about the tools and use cases that lead people to hit these limits. How many sub-agents are they spawning? How are they monitoring them?
Capricorn2481•3h ago
I'm not on the Pro plan, but on $20/mo; I asked Claude some 20 questions on architecture yesterday and it hit my limit.

This is going to be happening with every AI service. They are all burning cash and need to dumb it down somehow, whether that's running worse models or rate limiting.

rancar2•3h ago
There was a batch mode pulled from the documentation after the first few days of the Claude Code release. Many of us have been trying to be respectful with a stable 5-agent call, but some people have pushed those limits much higher, as it wasn't technically being throttled until last week.
WJW•2h ago
Tragedy of the commons strikes again...
micromacrofoot•3h ago
I've seen prompts telling it to spawn an agent to review every change it makes... and they're not monitoring anything
TrueDuality•3h ago
One with only manual interactions and regular context resets. I have a couple of commands I'll use regularly that have 200-500 words in them but it's almost exclusively me riding that console raw.

I'm only on the $100 Max plan and stick to the Sonnet model and I'll run into the hard usage limits after about three hours, that's been down to about two hours recently. The resets are about every four hours.
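The rolling-window accounting described above (hard limits after a few hours of use, resets every few hours) can be sketched as follows. The window length and token cap here are illustrative assumptions; Anthropic has not published the actual mechanics.

```python
from collections import deque

class RollingQuota:
    """Track token usage over a rolling time window and flag when a cap is hit.

    Events must be recorded in non-decreasing time order. The cap and
    window are made-up parameters for illustration only.
    """

    def __init__(self, cap_tokens, window_seconds=5 * 3600):
        self.cap = cap_tokens
        self.window = window_seconds
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def record(self, now, tokens):
        self.events.append((now, tokens))

    def used(self, now):
        # Drop events that have aged out of the window, then sum the rest.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        return sum(t for _, t in self.events)

    def limited(self, now):
        return self.used(now) >= self.cap
```

Under this model, heavy use early in a window blocks you until those events age out, which matches the "hit the limit after ~3 hours, reset a few hours later" experience described above.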

BolexNOLA•3h ago
Ha, I was just talking about this coming down the pipeline with folks days ago (in so many words): https://news.ycombinator.com/context?id=44565481
Aurornis•3h ago
I played with Claude Code using the basic $20/month plan for a toy side project.

I couldn't believe how many requests I could get in. I wasn't using this full-time for an entire workweek, but I thought for sure I'd be running into the $20/month limits quickly. Yet I never did.

To be fair, I spent a lot of time cleaning up after the AI and manually coding things it couldn't figure out. It still seemed like an incredible number of tokens were being processed. I don't have concrete numbers, but it felt like I was easily getting $10-20 worth of tokens (compared to raw API prices) out of it every single day.

My guess is that they left the limits extremely generous for a while to promote adoption, and now they're tightening them up because it’s starting to overwhelm their capacity.

I can't imagine how much vibe coding you'd have to be doing to hit the limits on the $200/month plan like this article, though.

dawnerd•3h ago
I hit the limits within an hour with just one request in CC, not even using Opus. It'll chug away but eventually switch to the nearing-limit message. It's really quite ridiculous, and not a good way to upsell to the higher plans without definitive usage numbers.
MystK•3m ago
Use `npx ccusage` if you're interested in how much it would have cost if you paid by API usage.
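The calculation a tool like ccusage performs is, in spirit, just token counts multiplied by per-token prices. A minimal sketch; the per-million-token prices below are assumptions for illustration, so check current published rates rather than relying on them:

```python
# Rough API-equivalent cost estimate from token counts.
# Prices are assumed figures, USD per million tokens: (input, output).
PRICES = {
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}

def api_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request if billed at raw API prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

Summing this over a day's requests is how people upthread arrive at claims like "I got $10-20 of API-equivalent tokens out of a $20/month plan every day."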
cladopa•3h ago
Thinking is extremely inefficient compared with the usual query in Chat.

If you think a lot, you can spend hundreds of dollars easily.

eddythompson80•2h ago
Worth noting that a lot of these limits are changing very rapidly (weekly if not daily) and also depend on time of day, location, account age, etc.
ChadMoran•1h ago
If you aren't hitting the limits you aren't writing great prompts. I can write a prompt and have it go off and work for about an hour and hit the limit. You can have it launch sub-agents, parallelize work and autonomously operate for long periods of time.

Think beyond just saying "do this one thing".

mrits•1h ago
That clears up a lot for me. I don't think I've ever had it take more than a couple of minutes. If it takes more than a minute I usually freak out and press stop.
stogot•1h ago
How is that a great prompt having it run for an hour without your input? Sounds like it’s just generating wasteful output.
cma•32m ago
It can be fixing unit tests and stuff for quite a while, but I usually find it cheats the goal when unattended.
buremba•3h ago
They're likely burning money so I can't be pissed off yet, but we see the same with Cursor as well; the pricing is not transparent.

I'm paying for Max, and when I use the tooling to calculate the spend returned by the API, I can see it's almost $1k! I have no idea how much quota I have left until the next block. The pricing returned by the API doesn't make any sense.

roxolotl•2h ago
A coworker of mine claimed they've been burning $1k a week this month. Pretty wild that it's only costing the company $200 a month.
gerdesj•2h ago
Crikey. Now I get the business model:

I hire someone for say £5K/mo. They then spend $200/mo or is it a $1000/wk on Claude or whatevs.

Profit!

AtheistOfFail•1h ago
The model is "outspend others until they're bankrupt".

Known as the Uber model or Amazon vs Diapers.com

devnullbrain•1h ago
It's a shame that the LLM era missed out on coinciding with the zero interest rates era. Just imagine the waste we could create.
margalabargala•1h ago
> Amazon vs Diapers.com

To be fair, that was a little different; Amazon wanted to buy the parent company of Diapers.com, so it sold at a loss to tank the company's value so it could buy it cheap.

Terretta•1h ago
Wasn't there a stat somewhere that a good o3-pro deep research was ~$3500, per question?
sothatsit•1h ago
I highly doubt that was ever the case in the UI version. You're probably thinking of when they benchmarked o3-high on ARC-AGI and it cost $3440 per question.
dfsegoat•2h ago
Can you clarify which tooling you are using? Is it cursor-stats?
neom•1h ago
We just came out of closed alpha yesterday and have been trying to figure out how best to price, if you'd be willing to provide any feedback I'd certainly appreciate it: https://www.charlielabs.ai/pricing - Thank you!! :)
iwontberude•3h ago
Claude Code is not worth the time sink for anyone who already knows what they are doing. It's not that hard to write boilerplate, and standard LLM auto-predict got you 95% of the way to Claude Code, Continue, Aider, Cursor, etc. without the extra headaches. The hangover from all this wasted investment is going to be so painful.
Sevii•3h ago
I've spent far too much of my life writing boilerplate and API integrations. Let Claude do it.
axpy906•3h ago
I agree. It's a lot faster to tell it what I want and work on something else in the meantime. You end up reading code diffs more than writing code, but it saves time.
serf•3h ago
>Claude Code is not worth the time sink

there are like ~15 total pages of documentation.

There are two folders, one for the home directory and one for the project root. You put a CLAUDE.md file in either folder, which essentially acts like a pre-prompt. There are like 5 'magic phrases', such as "think hard", 'make a todo', 'research...', and 'use agents' (or any similar set of phrases that triggers that route).

Every command can be run in the 'REPL' environment for instant feedback, it can itself teach you how to use the product, and /help will list every command.

The hooks document is a bit incomplete last I checked, but it's a fairly straightforward system, too.

That's about it. Now explain vi/vim/emacs/PyCharm/VS Code in a few sentences for me. The 'time sink' is like 4 hours for someone who isn't learning how to use the computer environment itself.
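For concreteness, a minimal project-root CLAUDE.md in the spirit of the setup described above might look like this. The contents are a hypothetical sketch, not official boilerplate:

```markdown
# CLAUDE.md (project root)

- Think hard before multi-file changes; make a todo first.
- Run the test suite after every change and fix failures before moving on.
- Never modify files under `vendor/`.
```

The same kind of file placed in the home directory applies across projects, while the project-root copy stays specific to one codebase.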

freedomben•2h ago
Yeah, Claude Code was by far the quickest/easiest for me to get set up. The longest part was just getting my API key
Implicated•2h ago
Comments like this remind me that there's a whole host of people out there who have _no idea_ what these tools are capable of doing to one's productivity or skill set in general.

> It's not that hard to write boilerplate and standard llm auto-predict was 95% of the way to Claude Code, Continue, Aider, Cursor, etc without the extra headaches.

Uh, no. To start - yea, boilerplate is easy. But like a sibling comment to this one said - it's also tedious and annoying, let the LLM do it. Beyond that, though, is that if you apply some curiosity and that "anyone that already knows what they are doing" level prior knowledge you can use these tools to _learn_ a great deal.

You might think your way of doing things is perfect, and the only way to do them, but I'm more of the mindset that there are a lot of ways to skin most of these cats. I'm always open to better ways to do things: patterns or approaches I know nothing about that might just be _perfect_ for what I'm trying to do. And given that I do, in general, know what I'm asking it to do, I'm able to judge whether its approach is any good. Sometimes it's not, no big deal. Sometimes it opens my mind to something I wasn't aware of, or didn't understand, or didn't know would apply to the given scenario. Sometimes it leads me into rabbit holes of "omg, that means I could do this ... over there" and it turns into a whole ass refactor.

Claude code has broadened my capabilities, professionally, tremendously. The way it makes available "try it out and see how it works" in terms of trying multiple approaches/libraries/databases/patterns/languages and how those have many times led me to learning something new - honestly, priceless.

I can see how these tools would scare the 9-5, sit-in-the-office-and-bang-out-boilerplate crowd, or those who are building things that have never been done before (but even then there are caveats, IMO, to how effective it would/could be in those cases)... but for people writing software or building things (software or otherwise) because they enjoy it, or because their financial or professional lives depend on what they're building, it's absolutely astonishing to me that anyone isn't embracing these tools with open arms.

With all that said. I keep the MCP servers limited to only if I need it in that session and generally if I'm needing an MCP server in an on-going basis I'm better off building a tool or custom documentation around that thing. And idk about all that agent stuff - I got lucky and held out for Claude Code, dabbled a bit with others and they're leagues behind. If I need an agent I'ma just tap on CC, for now.

Context and the ability to express what you want in a way that a human would understand is all you need. If you screw either of those up, you're gonna have a bad time.

adamtaylor_13•1h ago
Well said. People seem to be binary: I code with it or I don’t.

Very few folks are talking about using the LLMs to sharpen THE DEVELOPER.

Just today I troubleshot an issue that likely would've taken me 2-3 hours without additional input. I wrapped it up and put a bow on it in 15 minutes. Oh, and I also wrote a CLI tool to fix the issue for me next time. Oh, and a small write-up for the README for anyone else who runs into it.

Like… if you’re not embracing these tools at SOME level, you’re just being willfully ignorant at this point. There’s no badge of honor for willfully staying stuck in the past.

jmartrican•3h ago
I have the $100 plan and now quickly get downgraded to Sonnet, but so far I have not hit any other limits. I use it more on the weekends over several hours, so let's see what this weekend has in store.

I suspected that something like this might happen, where demand outstrips supply and squeezes small players out. I still think demand is in its infancy and that many of us will be forced to pay a lot more, unless of course there are breakthroughs. At work I recently switched to non-reasoning models because I find I get more work done and the quality is good enough. The queue to use Sonnet 3.7 and 4.0 is too long. Maybe the tools will improve to reduce token count, e.g. a token-reducing step (maybe this already exists).

j45•1h ago
Off hour usage seems to be different for sure.

Also, there's likely only so much fixed compute available, and it might be getting reallocated to other uses behind the scenes from time to time as more compute arrives.

blibble•3h ago
the day of COGS reckoning for the "AI" industry is approaching fast
apwell23•3h ago
oh yea looks like everyone and their grandma is hitting claude code

https://github.com/anthropics/claude-code/issues/3572

Inside info is they are using their servers to prioritize training for sonnet 4.5 to launch at the same time as xAI dedicated coding model. xAI coding logic is very close to sonnet 4 and has anthropic scrambling. xAI sucks at making designs but codes really well.

38•2h ago
Claude is absolute trash. I am on the paid plan and repeatedly hit the limits, and their support is essentially non-existent, even for paid accounts.
thr0waway001•2h ago
> One user, who asked not to be identified, said it has been impossible to advance his project since the usage limits came into effect.

Vibe limit reached. Gotta start doing some thinking.

dude250711•2h ago
He did not pass the vibe check.
bGl2YW5j•1h ago
Came to comment on the same quote.

I'm surprised, but know I shouldn't be, that we're at this point already.

mattigames•1h ago
I would be a little disappointed if that weren't the case; after all, we have been there quite a while with the art models.
mrits•1h ago
First one was free
mrits•1h ago
I honestly feel sorry for these vibe coders. I'm loving AI in a similar way to how I loved Google or IDE magic. This seems like a far worse version of those developers who tried to build an entire app with Eclipse or Visual Studio GUI drag-and-drop in the late 90s.
m4rtink•59m ago
Who would have thought including a hard dependency on a third-party service with unclear long-term availability would be a problem!

Paid compilers and remotely accessible mainframes all over again; people apparently never learn.

manquer•37m ago
> Paid compilers.

I don't think this one is a good comparison.

Once you had the binary, the compiler worked forever[1]

The issue with them was around long term support for bugs and upgrade path as the language evolved.

---

[1] as long you had a machine capable of running/emulating the instruction set for the binary.

skort•8m ago
Right, but these companies are selling their products on the basis that you can offload a good amount of the thinking. And it seems a good deal of investment in AI is also based on this premise. I don't disagree with you, but it's sorta fucked that so much money has been pumped into this and that markets seem to still be okay with it all.
globular-toast•2h ago
This is what really makes me sceptical of these tools. I've tried Claude Code and it does save some time even if I find the process boring and unappealing. But as much as I hate typing, my keyboard is mine and isn't just going to disappear one day, have its price hiked or refuse to work after 1000 lines. I would hate to get used to these tools then find I don't have them any more. I'm all for cutting down on typing but I'll wait until I can run things entirely locally.
bigiain•2h ago
> my keyboard is mine and isn't just going to disappear one day, have its price hiked or refuse to work after 1000 lines.

I dunno, from my company or boss's perspective, there are definitely days where I've seriously considered just disappearing, demanding a raise, or refusing to work after the 3rd meeting or 17th Jira ticket. And I've seen cow orkers and friends do all three of those over my career.

(Perhaps LLMs are closer to replacing human developers than anyone has realized yet?)

MisterSandman•2h ago
I guess the argument is that as time goes on, AI will get cheaper and more efficient.

...but idk how true that is. I think it's pretty clear that these companies are using the Uber model to attract customers, and the fact that they're already increasing prices or throttling is kind of insane.

khurs•2h ago
All you people who were happy to pay $100 and $200 a month have ruined it for the rest of us!!
rob•2h ago
I don't think CLI/terminal-based approaches are going to win out in the long run compared to visual IDEs like Cursor, but I think Anthropic has something good with Claude Code and I've been loving it lately (after using only Cursor for a while). Wouldn't be surprised if they end up purchasing Cursor after squeezing them out via pricing, then merging Cursor + Claude Code so you have the best of both worlds under one name.
ladon86•2h ago
I think it was just an outage that unfortunately returned 429 errors instead of something else.
sneilan1•2h ago
So far I've had 3-4 Claude Code instances constantly working 8-12 hours a day, every day. I use it like a stick shift, though: when I need a big plan doc, I switch to the recommended model between Opus and Sonnet, and for coding I use Sonnet. Sometimes I hit the Opus limit, but I simply switch to Sonnet for the day and watch it more closely.
mpeg•2h ago
Honest question: what do you do with them? I would be so fascinated to see a video of this kind of workflow. I feel like I use LLMs as much as I can while still being productive (because the code they generate has a lot of slop), and I still barely use the agentic CLIs: mostly just tab completion through Windsurf, and Claude for specific questions, steering the context by manually pasting in the relevant stuff.
sneilan1•2h ago
I focus more on reading code and prompting Claude to write code for me at a high level. I also experiment a lot. I don't write code by hand anymore except in very rare cases. I ask Claude questions about the code to build understanding. I have it produce documentation, which is then consumed into other prompts. Often, Claude Code will need several minutes on a task, so I start another task. My day-to-day coding throughput is now the equivalent of about 2-3 people.

I also use Gemini to try out trading ideas. For example, the other day I had Gemini process Google's latest quarterly report to come up with a market value given the total sum of all its businesses. It valued Google at $215. Then I bought long call options on Google. Literally vibe day trading.

I use chat gpt sora to experiment with art. I've always been fascinated with frank lloyd wright and o4 has gotten good enough to not munge the squares around in the coonley playhouse image so that's been a lot of fun to mess with.

I use cheaper models & rag to automate categorizing of my transactions in Tiller. Claude code does the devops/python scripting to set up anything google cloud related so I can connect directly to my budget spreadsheet in google sheets. Then I use llama via openrouter + a complex RAG system to analyze my historical credit card data & come up with accurate categorizations for new transactions.

This is only scratching the surface. I now use claude for devops, frontend, backend, fixing issues with embedder models in huggingface candle. The list is endless.
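A toy version of that retrieval-based transaction categorization: here, bag-of-words cosine similarity stands in for real embedding vectors so the sketch runs offline, and the merchant strings and labels are invented for illustration (the actual setup described uses an LLM via OpenRouter plus a RAG system).

```python
from collections import Counter
import math

# Labeled example transactions acting as the retrieval corpus.
LABELED = [
    ("STARBUCKS #1234 SEATTLE", "coffee"),
    ("SHELL OIL 5551212", "fuel"),
    ("WHOLEFDS MKT 103", "groceries"),
]

def _vec(text):
    """Bag-of-words vector: token -> count."""
    return Counter(text.lower().split())

def _cos(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def categorize(description):
    """Return the category of the most similar labeled transaction."""
    v = _vec(description)
    best = max(LABELED, key=lambda ex: _cos(v, _vec(ex[0])))
    return best[1]
```

Swapping `_vec`/`_cos` for model embeddings and nearest-neighbor search gives the production-shaped version of the same idea.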

aoaoaoans•1h ago
Can you share some code? I work with a guy like this who claims this level of output but in reality he consumes massive amounts of other devs time in PR review.

Are you doing a lot of broad throwaway tasks? I’ve had similar feelings when writing custom code for my editor, one off scripts, etc but it’s nothing I would ever put my professional reputation behind.

sneilan1•55m ago
Sorry, most of my code is proprietary. However, I have a stock exchange project on my github I plan to rewrite in rust. I'm pretty busy now at work but I'll do that using claude code.

If your friend is consuming massive amounts of other devs' time in PR reviews, maybe he has other issues. I'm willing to bet that even without agentic coding, he would still be a problem for your coworkers.

Sometimes I do broad throwaway tasks. For example I needed a rust lambda function that would do appsync event authorization for jwt tokens. All it needed to do was connect to aws secrets, load up the keys & check inbound requests. I basically had claude-code do everything from cdk to building/testing the rust function & deploying to staging. It worked great! However, I've certainly had my fair share of f-ups like I recently tried doing some work on the frontend with claude code and didn't realize it was doing useEffect everywhere!! Whoops. So I had to adapt and manage 2-3x claude code instances extremely closely to prevent that from happening again.

sneilan1•40m ago
As a follow-up, I've gotten much, much faster at modeling code in my mind and directly translating it into prompts. It really changes how you code! For each task, I'm extremely specific about what I want, and depending on how closely Claude does what I want, I adjust my specificity. Sometimes, as with the lambda function, I can be high-level, while my React.js codebase, due to its lack of types (I know...), needs extra attention.

To be effective with agentic coding, you have to know when to go high level and low level. And have to accept that sometimes agentic coders need a lot of help! It all depends on how much context you give it.

jasonthorsness•2h ago
Is it really worth it to use opus vs. sonnet? sonnet is pretty good on its own.
Ataraxic•2h ago
I need to see a video of what people are doing to hit the max limits regularly.

I find Sonnet really useful for coding, but I never even hit the basic limits at $20/mo: writing specs, coming up with documentation, doing rote tasks for which many examples exist in the database, iterating on particular services, etc.

Are these max users having it write the whole codebase w/ rewrites? Isn't it often just faster to fix small things I find incorrect than type up why I think it's wrong in English and have it do a whole big round trip?

adamtaylor_13•2h ago
I couldn’t even get it to do simple tasks for me this week on the max plan. It’s not just max users overloading it. It feels like they’re randomly rate limiting users.

One day my very first prompt in the morning was blocked. Super strange.

Ensorceled•1h ago
> Isn't it often just faster to fix small things I find incorrect than type up why I think it's wrong in English and have it do a whole big round trip?

This is my experience: at some point the AI isn't converging to a final solution and it's time to finish the rest by hand.

bluefirebrand•1h ago
My experience is that if the AI doesn't oneshot it, it's faster to do it myself

If you find yourself going back and forth with the AI, you're probably not saving time over a traditional google search

Edit: and it basically never oneshots anything correctly

nh43215rgb•1h ago
Are you using Claude Code for coding with Sonnet? Claude web use alone is indeed fairly relaxed, I think.
sothatsit•1h ago
I can tell you how I hit it: Opus and long workflows.

I have two big workflows: plan and implement. Plan follows a detailed workflow to research an idea and produce a planning document for how to implement it. This routinely takes $10-30 in API credits to run in the background. I will then review this 200-600 line document and fix up any mistakes or remove unnecessary details.

Then implement is usually cheaper, and it will take that big planning document, make all the changes, and then make a PR in GitHub for me to review. This usually costs $5-15 in API credits.

All it takes is for me to do 3-4 of these in one 5-hour block and I will hit the rate-limit of the $100 Max plan. Setting this up made me realise just how much scaffolding you can give to Opus and it handles it like a champ. It is an unbelievably reliable model at following detailed instructions.

It is rare that I would hit the rate-limits if I am just using Claude Code interactively, unless I am using it constantly for hours at a time, which is rare. Seems like vibe coders are the main people who would hit them regularly.
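Tallying the ranges above shows why a handful of cycles exhausts a 5-hour block. A quick sketch; the dollar ranges and cycle counts come from the comment itself, and nothing here is an official quota:

```python
# Each plan+implement cycle, per the comment: plan costs $10-30 in
# API-equivalent credits, implement costs $5-15.
PLAN = (10, 30)        # (low, high) USD per planning run
IMPLEMENT = (5, 15)    # (low, high) USD per implementation run

def block_spend(cycles):
    """(low, high) API-equivalent spend for N plan+implement cycles."""
    low = cycles * (PLAN[0] + IMPLEMENT[0])
    high = cycles * (PLAN[1] + IMPLEMENT[1])
    return low, high
```

Four cycles lands somewhere between $60 and $180 of API-equivalent usage in a single block, which makes hitting the $100 Max plan's limit unsurprising.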

vineyardmike•15m ago
This is very interesting as a workflow. How “big” are the asks you’re giving Claude? Can you give an example of the type of question you’d ask it to implement where it requires a discrete planning document that long?

Whenever I use it, I typically do much smaller asks, eg “add a button here”, “make that button trigger a refresh with a filter of such state…”

martinald•2h ago
I'm not sure this is "intentional" per se, or just massively overloaded servers because of unexpected demand growth, with rate limits cut until they can scale up more. This may become permanent, or worse, if demand keeps outstripping their ability to scale.

I'd be extremely surprised if Anthropic picked now of all times to decide on COGS optimisation. They can potentially take a significant slice of the entire DevTools market with the growth they are seeing; it seems short-sighted to me to nerf that when they have oodles of cash in the bank and no doubt people hammering at their door to throw more cash at them.

andix•1h ago
A lot of people switched away from Cursor within the blink of an eye. Switching IDEs is a big deal for me - it takes a lot of effort, which is why I never switched to Cursor in the first place.

I think Claude Code is a much better concept, the coding agent doesn't need to be connected to the IDE at all. Which also means you can switch even faster to a competitor. In that sense, Claude Code may have been a huge footgun. Gaining market share might turn out to be completely worthless.

mattnewton•1h ago
I think in the case of Cursor, they are one of many VS Code forks, so a switch is not really very challenging. I agree there is little to keep me on any individual app or model (which is one reason I think Cursor's reported $9B valuation is a little crazy!)
andix•1h ago
Only if you're using VS code in the first place. VS code is fine for web dev and js/ts/python. But I really don't like it for Java, C#, C++, SQL, and many more.
adamtaylor_13•2h ago
That’s funny I literally started the $200/month plan this week because I routinely spend $300+/month on API tokens.

And I was thinking to myself, “How does this make any sense financially for Anthropic to let me have all of this for $200/month?”

And then I kept getting hit with those overloaded api errors so I canceled my plan and went back to API tokens.

I still have no idea what they’re doing over there but I’ll happily pay for access. Just stop dangling that damn $200/month in my face if you’re not going to honor it with reasonable access.

pembrook•1h ago
The funny thing is Claude 4.0 isn't even that 'smart' from a raw intelligence perspective compared to the other flagship models.

They've just done the work to tailor it specifically for proper tool using during coding. Once other models catch up, they will not be able to be so stingy on limits.

Google has the advantage here given they're running on their own silicon; can optimize for it; and have nearly unlimited cashflows they can burn.

I find it amusing that nobody here in the comments can understand the scaling laws of compute. It seems like people have the mental model of Uber burned into their heads, thinking that at some point the price has to go up. AI is not human labor.

Over time the price of compute will fall, not rise. Losing money in the short term betting this will happen is not a dumb strategy given it's the most likely scenario.

I know everybody really wants this bubble to pop so they can make themselves feel smart for "calling it" (and feel less jealous of the people who got in early) and I'm sure there will be a pop, but in the long term this is all correct.

macinjosh•1h ago
Prices for yesterday's frontier models will fall, but there will always be a next big model, similar to how game graphics get ever better but ever more demanding at the bleeding edge.
carlhjerpe•1h ago
Yes but games also look an awful lot better (fidelity wise) than not so many years ago.
andix•1h ago
The thing is, all the models are not that 'smart'. None of them is AGI.

Currently it's much more important to manage context, split tasks, retry when needed, not getting stuck in an infinite loop, expose the right tools (but not too many), ...
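A minimal, hypothetical sketch of that kind of harness (the limits, dict shapes, and `call_model` interface are all invented for illustration, not any vendor's actual API):

```python
# Hypothetical agent-harness sketch: bounded retries, trimmed context, and a
# small curated tool list. All names and limits here are illustrative.

MAX_ATTEMPTS = 3       # retry when needed, but never loop forever
MAX_CONTEXT = 8000     # keep only the tail of the accumulated context

def run_task(task, history, tools, call_model):
    """Run one task through `call_model`, retrying with error feedback."""
    context = history[-MAX_CONTEXT:]                 # manage context size
    for _ in range(MAX_ATTEMPTS):
        result = call_model(task, context, tools)
        if result.get("ok"):
            return result
        # feed the failure back so the next attempt can correct course
        context = (context + str(result.get("error", "")))[-MAX_CONTEXT:]
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts")
```

The point is that the scaffolding (caps, feedback, tool curation) does a lot of the work that raw model "smarts" don't.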

alphager•1h ago
Even if Moore's law were still in effect, and the compute required stayed the same and compute stayed as efficient per watt (neither is true), it would only halve compute costs every 18 months. People upthread report hitting $4,000/month in costs on the $200 plan. That's 8 years until it's cost-effective.
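The back-of-the-envelope arithmetic, as a sketch (assuming costs fall only through a halving every 18 months, and using the figures above):

```python
import math

monthly_cost = 4000    # reported monthly usage cost upthread, in dollars
plan_price = 200       # flat subscription price, in dollars
halving_period = 1.5   # years per cost halving, i.e. every 18 months

factor = monthly_cost / plan_price        # need a 20x cost reduction
halvings = math.ceil(math.log2(factor))   # 2^5 = 32 >= 20, so 5 halvings
years = halvings * halving_period         # 7.5 years, roughly 8
```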

Are they really ready to burn money for 8 years?

pembrook•1h ago
Uber operated at a loss for 9 years. They're now a profitable, market-winning business.

Amazon operated at a loss for 9 years, and barely turned a profit for over a decade longer than that. They're now one of the greatest businesses of all time.

Spotify operated at a loss for 17 years until becoming profitable. Tesla operated at a loss for 17 years before turning a profit. Palantir operated at a loss for 20 years before turning a profit.

And this was before the real age of Big Tech. Google has more cashflows they can burn than any of these companies ever raised, combined.

klik99•51m ago
Those aren’t good comparisons.

Uber operated at a loss to destroy competition and raised prices after they did that.

Amazon (the retailer) did the same and leveraged their position to enter new more lucrative markets.

Dunno about Spotify, but Tesla and Palantir both secured lucrative contracts and subsidies.

Anthropic is up against companies with deeper pockets and can’t spend its way to destroying the competition; its current business model can only survive if it reduces costs or raises prices. Something’s got to give

pembrook•44m ago
They are good comparisons. All startups go against incumbents/competitors with deeper pockets.

Re: Anthropic specifically, I tend to agree, hence why I'm saying the deeper pockets (e.g. Google, Amazon, etc.) are perfectly positioned to win here. However, big companies have a way of consistently missing the future due to internal incentive issues. Google is deathly afraid of cannibalizing its existing businesses.

Plus, there's many investors with deep pockets who would love to get in on Anthropic's next round if their technical lead proves to be durable over time (like 6 months in AI terms).

This fight is still early innings.

sothatsit•1h ago
I think people also expect models to be optimised over time. For example, the 5x drop in cost of o3 was probably due to some optimisation on OpenAI's end (although I'm sure they had business reasons for dropping the price as well).

Small models have also been improving steadily in ability, so it is feasible that a task that needs Claude Opus today could be done by Sonnet in a year's time. This trend of model "efficiency" will add on top of compute getting cheaper.

Although that efficiency would probably be quickly eaten up by increased appetite for higher-performance, bigger models.

lacker•23m ago
Will the other models really catch up, though? To me it seems like Anthropic's lead in programming has increased over the past year. Isn't it possible that over time, some models just become fundamentally better at some things than other models?
danny_codes•17m ago
I mean, not based on anything we’ve seen so far in the DL space. The algorithms are public and the compute is fungible; the only differentiator is data. But DeepSeek demonstrates that it’s somewhat easy to siphon data off other models, so... yeah, unclear where the moat is.
andix•1h ago
I guess flat fee AI subscriptions are not a thing that is going to work out.

Probably better to stay on usage based pricing, and just accept that every API call will be charged to your account.

bgwalter•1h ago
“It just stopped the ability to make progress,” the user told TechCrunch. “I tried Gemini and Kimi, but there’s really nothing else that’s competitive with the capability set of Claude Code right now.”

This is probably another marketing stunt. Turn off the flow of cocaine and have users find out how addicted they are. And they'll pay for the purest cocaine, not for second grade.

ceejayoz•1h ago
It was always gonna be the Uber approach. Cheap and great turns to expensive and mediocre when the investor money spigot turns off.
bad_haircut72•51m ago
I went from Pro to Max because I have been hitting limits. I could tell they were reducing them; I used to go multiple hours on Pro, but now it's like 3. Congrats, Anthropic, you got $100 more out of me, at the cost of irrecoverable goodwill
hellcow•43m ago
For what it's worth, when Cursor downgraded their Claude limits in the middle of my annual subscription term, I emailed them to ask for a pro-rated refund, and it was granted. You may be able to do something similar with Claude Code.

Changing the terms of the deal midway through a subscription to make it much less valuable is a really shady business practice, and I'm not sure it's legal.

ants_everywhere•45m ago
The other day I was doing major refactorings on two projects simultaneously while doing design work for two other projects. It occurred to me to check my API usage for Gemini and I had spent $200 that day already.

Users are no doubt working these things even harder than I am. There's no way they can be profitable at $200 a month with unlimited usage.

I think we're going to evolve into a system that intelligently allocates tasks based on cost. I think that's part of what openrouter is trying to do, but it's going to require a lot of context information to do the routing correctly.
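A toy sketch of what that cost-based allocation might look like (model names, capability tiers, and prices are all invented; this is not OpenRouter's actual API):

```python
# Toy cost-aware router: pick the cheapest model whose capability tier
# covers the task. Everything here is hypothetical, for illustration only.

MODELS = [
    {"name": "small",  "tier": 1, "usd_per_mtok": 0.25},
    {"name": "medium", "tier": 2, "usd_per_mtok": 3.00},
    {"name": "large",  "tier": 3, "usd_per_mtok": 15.00},
]

def route(task_tier):
    """Return the cheapest model at least as capable as the task requires."""
    candidates = [m for m in MODELS if m["tier"] >= task_tier]
    if not candidates:
        raise ValueError("no model can handle this task tier")
    return min(candidates, key=lambda m: m["usd_per_mtok"])
```

The hard part, as noted, is estimating `task_tier` well, which needs a lot of context about the task itself.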

memothon•42m ago
I made a quick site so you can see what tools are using the most context and help control it, totally free and in your browser.

https://claude-code-analysis.pages.dev/

WhyNotHugo•30m ago
I wish models we can self-host at home would start catching up. Relying on hosted providers like this is a huge risk, as this case shows.

I just worry that there’s little incentive for big corporations to research optimising the “running queries for a single user on a consumer GPU” use case. I wonder if getting funding for such research is even viable at all.

vineyardmike•22m ago
We already have really strong models that run on a consumer GPU, and really strong frameworks and libraries to support them.

The issue is that (1) the extra size supports extra knowledge/abilities for the model, and (2) a lot of the open-source models are trained in a way that avoids competing with the paid offerings, or lack the training data of the useful models.

Specifically, it seems like the tool-use heavy “agentic” work is not being pushed to open models as aggressively as the big closed models. Presumably because that’s where the money is.

YmiYugy•19m ago
I think model providers would love to run their models on a single GPU. The latency and throughput of GPU interconnects is orders of magnitudes worse than accessing VRAM. Cutting out the latency would make the models much more efficient to run, they wouldn't have to pay for such expensive networking. If they got to run it on consumer GPUs even better. Consumer GPUs probably cost something like 5-10x less with regards to raw compute than data center ones. New coding optimized models for single GPUs drop all the time. But it's just a really hard problem to make them good and when the large models are still in the barely good enough phase (I wasn't using agents much before Sonnet 4) it's just not realistic to get something useful locally.
tho234i32242234•17m ago
Hardly surprising.

AWS Bedrock, which seems to be a popular way to get access to Claude etc. without having to go through another "cloud security audit", will easily run up ~$20-30 bills in half an hour with something like Cline.

Anthropic is likely making bank on this and can afford to lose the less profitable (or even loss-making) business of solo developers.