I couldn't believe how many requests I could get in. I wasn't using this full-time for an entire workweek, but I thought for sure I'd be running into the $20/month limits quickly. Yet I never did.
To be fair, I spent a lot of time cleaning up after the AI and manually coding things it couldn't figure out. It still seemed like an incredible number of tokens were being processed. I don't have concrete numbers, but it felt like I was easily getting $10-20 worth of tokens (compared to raw API prices) out of it every single day.
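That back-of-envelope comparison is simple arithmetic. A sketch, where the per-million-token prices are assumptions for illustration, not Anthropic's actual rates:

```python
# Rough estimate of what a day of heavy assistant use would cost at
# raw API prices. The prices below are assumed, not official rates.
INPUT_PRICE_PER_MTOK = 3.00    # assumed $/million input tokens
OUTPUT_PRICE_PER_MTOK = 15.00  # assumed $/million output tokens

def daily_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a day's usage at the assumed per-token prices."""
    return (input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
            + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK)

# e.g. 3M input tokens and 400k output tokens in one day:
print(round(daily_cost(3_000_000, 400_000), 2))  # → 15.0
```

At those assumed rates, a few million tokens a day lands squarely in the $10-20 range the comment describes.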
My guess is that they left the limits extremely generous for a while to promote adoption, and now they're tightening them up because it’s starting to overwhelm their capacity.
I can't imagine how much vibe coding you'd have to be doing to hit the limits on the $200/month plan like the author of this article, though.
If you think a lot, you can spend hundreds of dollars easily.
Think beyond just saying "do this one thing".
What do you do for hours?
If all you're thinking about is code output, you're thinking too small.
https://www.anthropic.com/news/claude-4
It was given a task and it solved a problem by operating for 7 hours straight.
I'm paying for Max, and when I use the tooling to calculate the equivalent API spend, it comes out to almost $1k! I have no idea how much quota I have left until the next block. The pricing returned by the API doesn't make any sense.
I hire someone for, say, £5K/mo. They then spend $200/mo (or is it $1000/wk?) on Claude or whatevs.
Profit!
Known as the Uber model or Amazon vs Diapers.com
To be fair that was a little different; Amazon wanted to buy the parent company of Diapers.com so sold at a loss to tank the value of the company so they could buy it cheap.
I’ll assume this is real and not trolling. Who are the customers? What kind of people spend that much? I know people on $200-300 plans, but this is 10x that!
What do you mean? That’s totally a good reason to be pissed off at them. I’m so tired of products that launch before they have a clear path to profitability.
There are only ~15 total pages of documentation.
There are two folders, one for the home directory and one for the project root. You put a CLAUDE.md file in either folder, and it essentially acts like a pre-prompt. There are about 5 'magic phrases' like 'think hard', 'make a todo', 'research...', and 'use agents' -- or any similar phrase that triggers that route.
Every command can be run in the 'REPL' environment for instant feedback; it can itself teach you how to use the product, and /help will list every command.
The hooks document is a bit incomplete last I checked, but it's a fairly straightforward system, too.
That's about it -- now explain vi/vim/emacs/pycharm/vscode in a few sentences for me. The 'time sink' is maybe 4 hours for someone who isn't learning how to use the computer environment itself.
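For anyone who hasn't seen one, a CLAUDE.md is just plain markdown; the contents below are invented for illustration, not an official template:

```markdown
# CLAUDE.md (project root) -- illustrative contents only

## Project conventions
- Run the test suite before declaring a task done.
- Follow the existing error-handling style in the codebase.
- Never commit directly; always stop and show a diff first.
```

The home-directory copy works the same way but applies across all projects.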
> It's not that hard to write boilerplate and standard llm auto-predict was 95% of the way to Claude Code, Continue, Aider, Cursor, etc without the extra headaches.
Uh, no. To start - yea, boilerplate is easy. But like a sibling comment to this one said - it's also tedious and annoying, let the LLM do it. Beyond that, though, is that if you apply some curiosity and that "anyone that already knows what they are doing" level prior knowledge you can use these tools to _learn_ a great deal.
You might think your way of doing things is perfect, and the only way to do them - but I'm more of the mindset that there are a lot of ways to skin most of these cats. I'm always open to better ways to do things - patterns or approaches I know nothing about that might just be _perfect_ for what I'm trying to do. And given that I do, in general, know what I'm asking it to do, I'm able to judge whether its approach is any good. Sometimes it's not, no big deal. Sometimes it opens my mind to something I wasn't aware of, or didn't understand or know would apply to the given scenario. Sometimes it leads me into rabbit holes of "omg, that means I could do this ... over there" and it turns into a whole ass refactor.
Claude code has broadened my capabilities, professionally, tremendously. The way it makes available "try it out and see how it works" in terms of trying multiple approaches/libraries/databases/patterns/languages and how those have many times led me to learning something new - honestly, priceless.
I can see how these tools would scare the 9-5, sit-in-the-office, bang-out-boilerplate crowd, or those who are building things that have never been done before (but even then, there are caveats, IMO, to how effective it would/could be in those cases)... but to people writing software or building things (software or otherwise) because they enjoy it or because their financial or professional lives depend on what they're building - it's absolutely astonishing to me that anyone isn't embracing these tools with open arms.
With all that said. I keep the MCP servers limited to only if I need it in that session and generally if I'm needing an MCP server in an on-going basis I'm better off building a tool or custom documentation around that thing. And idk about all that agent stuff - I got lucky and held out for Claude Code, dabbled a bit with others and they're leagues behind. If I need an agent I'ma just tap on CC, for now.
Context and the ability to express what you want in a way that a human would understand is all you need. If you screw either of those up, you're gonna have a bad time.
Very few folks are talking about using the LLMs to sharpen THE DEVELOPER.
Just today I troubleshot an issue that likely would’ve taken me 2-3 hours without additional input. I wrapped it up and put a bow on it in 15 minutes. Oh, and also wrote a CLI tool to fix the issue for me next time. Oh, and wrote a small write-up for the README for anyone else who runs into it.
Like… if you’re not embracing these tools at SOME level, you’re just being willfully ignorant at this point. There’s no badge of honor for willfully staying stuck in the past.
I suspected that something like this might happen, where demand outstrips supply and squeezes small players out. I still think demand is in its infancy and that many of us will be forced to pay a lot more. Unless of course there are breakthroughs. At work I recently switched to non-reasoning models because I find I get more work done and the quality is good enough. The queue to use Sonnet 3.7 and 4.0 is too long. Maybe the tools will improve to reduce token count, e.g. with a token-reducing step (and maybe this already exists).
Also, there's likely only so much fixed compute available, and it might be getting reallocated to other uses behind the scenes from time to time as more compute arrives.
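A token-reducing step could be as crude as stripping comments and blank runs from code before it goes into the context window. A toy sketch of the idea (real reducers would be much smarter about what they drop):

```python
import re

def shrink_context(source: str) -> str:
    """Crude token reducer: drop comment-only lines and collapse
    runs of blank lines before sending code to a model."""
    lines = [ln for ln in source.splitlines()
             if not ln.lstrip().startswith("#")]
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip()

print(shrink_context("x = 1\n# note\n\n\n\ny = 2"))
```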
https://github.com/anthropics/claude-code/issues/3572
Inside info is they are using their servers to prioritize training for sonnet 4.5 to launch at the same time as xAI dedicated coding model. xAI coding logic is very close to sonnet 4 and has anthropic scrambling. xAI sucks at making designs but codes really well.
Vibe limit reached. Gotta start doing some thinking.
I'm surprised, but know I shouldn't be, that we're at this point already.
It's not, but it does matter. LLMs, being next-word guessers, perform differently with different inputs. It's not hard to imagine a feedback loop of bad code generating worse code and good code generating more good code.
My ability to get good responses from LLMs has been tied to me writing better code, docstrings, and using autoformatters.
Paid compilers and remotely accessible mainframes all over again - people apparently never learn.
I don't think this one is a good comparison.
Once you had the binary, the compiler worked forever[1]
The issue with them was around long term support for bugs and upgrade path as the language evolved.
---
[1] as long you had a machine capable of running/emulating the instruction set for the binary.
The destruction of spelling didn’t feel like doomsday to us. In fact, I think most people treated the utter annihilation of that skill as a joke. “No one knows how to spell anymore” - haha, funny, isn’t technology cute? Not really. We’ve gone up an order of magnitude, and not paying attention to how programming is on the chopping block is going to screw a lot of people out of that skill.
For developers who read and understand the code being generated, the tool could go away and it would only slow you down, not block progress.
And even if you don’t, it really isn’t a hard dependency on a particular tool. There are multiple competing tools and models to choose from, so if you can’t make progress with one, switch to another. There isn’t much lock-in to any specific tool.
I dunno, from my company or boss's perspective, there are definitely days where I've seriously considered just disappearing, demanding a raise, or refusing to work after the 3rd meeting or 17th Jira ticket. And I've seen coworkers and friends do all three of those over my career.
(Perhaps LLMs are closer to replacing human developers than anyone has realized yet?)
…but idk how true that is; I think it’s pretty clear that these companies are using the Uber model to attract customers, and the fact that they’re already increasing prices or throttling is kind of insane.
I also use Gemini to try out trading ideas. For example, the other day I had Gemini process Google's latest quarterly report to estimate a market value given the total sum of all its businesses. It valued Google at $215. Then I bought long call options on Google. Literally vibe day trading.
I use chat gpt sora to experiment with art. I've always been fascinated with frank lloyd wright and o4 has gotten good enough to not munge the squares around in the coonley playhouse image so that's been a lot of fun to mess with.
I use cheaper models & rag to automate categorizing of my transactions in Tiller. Claude code does the devops/python scripting to set up anything google cloud related so I can connect directly to my budget spreadsheet in google sheets. Then I use llama via openrouter + a complex RAG system to analyze my historical credit card data & come up with accurate categorizations for new transactions.
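The nearest-neighbour core of a categorization setup like that can be sketched without any external services. Here a toy bag-of-words vector stands in for a real embedding model, and the `HISTORY` transactions are invented:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: bag of lowercase words."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented historical transactions with known categories.
HISTORY = [
    ("starbucks coffee seattle", "dining"),
    ("shell gas station", "auto"),
    ("whole foods market", "groceries"),
]

def categorize(description: str) -> str:
    """Assign the category of the most similar historical transaction."""
    q = embed(description)
    return max(HISTORY, key=lambda h: cosine(q, embed(h[0])))[1]

print(categorize("starbucks downtown"))  # → dining
```

Swap `embed` for a real embedding call and `HISTORY` for a vector store and you have the shape of the RAG pipeline described above.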
This is only scratching the surface. I now use claude for devops, frontend, backend, fixing issues with embedder models in huggingface candle. The list is endless.
Are you doing a lot of broad throwaway tasks? I’ve had similar feelings when writing custom code for my editor, one off scripts, etc but it’s nothing I would ever put my professional reputation behind.
If your friend is consuming massive amounts of other dev time in PR reviews, maybe he has other issues. I'm willing to bet even without agentic coding, he would still be problem for your coworkers.
Sometimes I do broad throwaway tasks. For example, I needed a Rust lambda function that would do AppSync event authorization for JWT tokens. All it needed to do was connect to AWS Secrets, load up the keys, and check inbound requests. I basically had claude-code do everything from CDK to building/testing the Rust function and deploying to staging. It worked great! However, I've certainly had my fair share of f-ups. I recently tried doing some work on the frontend with Claude Code and didn't realize it was using useEffect everywhere!! Whoops. So I had to adapt and manage 2-3x Claude Code instances extremely closely to prevent that from happening again.
To be effective with agentic coding, you have to know when to go high level and low level. And have to accept that sometimes agentic coders need a lot of help! It all depends on how much context you give it.
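For context, a JWT check like the one in that authorizer boils down to recomputing a signature over the token. A stdlib-only HS256 sketch in Python (the real authorizer presumably used RS256 keys fetched from AWS Secrets, so this just shows the shape of the check):

```python
import base64, hashlib, hmac, json

def b64url_decode(s: str) -> bytes:
    # JWTs use unpadded base64url; restore padding before decoding.
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify an HS256 JWT's signature and return its claims, else raise."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Mint a token, mainly useful for testing the verifier above."""
    def enc(obj) -> str:
        raw = json.dumps(obj, separators=(",", ":")).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    head, body = enc({"alg": "HS256", "typ": "JWT"}), enc(claims)
    sig = hmac.new(secret, f"{head}.{body}".encode(), hashlib.sha256).digest()
    return f"{head}.{body}." + base64.urlsafe_b64encode(sig).rstrip(b"=").decode()
```

A production authorizer also checks the `alg` header, expiry, audience, and issuer; this sketch only covers the signature.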
I find Sonnet really useful for coding, but at $20/mo I never even hit the basic limits. Writing specs, coming up with documentation, doing rote tasks for which many examples exist in the training data. Iterating on particular services, etc.
Are these max users having it write the whole codebase w/ rewrites? Isn't it often just faster to fix small things I find incorrect than type up why I think it's wrong in English and have it do a whole big round trip?
One day my very first prompt in the morning was blocked. Super strange.
This is my experience: at some point the AI isn't converging to a final solution and it's time to finish the rest by hand.
If you find yourself going back and forth with the AI, you're probably not saving time over a traditional google search
Edit: and it basically never oneshots anything correctly
I have two big workflows: plan and implement. Plan follows a detailed workflow to research an idea and produce a planning document for how to implement it. This routinely takes $10-30 in API credits to run in the background. I will then review this 200-600 line document and fix up any mistakes or remove unnecessary details.
Then implement is usually cheaper, and it will take that big planning document, make all the changes, and then make a PR in GitHub for me to review. This usually costs $5-15 in API credits.
All it takes is for me to do 3-4 of these in one 5-hour block and I will hit the rate-limit of the $100 Max plan. Setting this up made me realise just how much scaffolding you can give to Opus and it handles it like a champ. It is an unbelievably reliable model at following detailed instructions.
It is rare that I would hit the rate-limits if I am just using Claude Code interactively, unless I am using it constantly for hours at a time, which is rare. Seems like vibe coders are the main people who would hit them regularly.
Whenever I use it, I typically do much smaller asks, eg “add a button here”, “make that button trigger a refresh with a filter of such state…”
I have also used this for creating new UI components or admin pages. One thing I have noticed is that the planning step is pretty good at searching through existing UI components to follow their patterns to maintain consistency. If I just asked Claude to make the change straight away, it often won't follow the patterns of our codebase.
But for UI components, adding new pages, or things like that, it is usually more useful just as a starting point and I will often need to go in and tweak things from there. But often it is a pretty good starting point. And if it's not, I can just discard the changes anyway.
I find this is not worth it for very small tasks though, like adding a simple button or making a small behaviour change to a UI component. It will usually overcomplicate these small tasks and add in big testing rigs or performance optimisations, or other irrelevant concerns. It is like it doesn't want to produce a very short plan. So, for things like this I will use Claude interactively, or just make the change manually. Honestly, even if it did do a good job at these small tasks, it would still seem like overkill.
https://github.com/shepherdjerred/scout-for-lol/blob/6c6a3ca...
K&R C is underspecified. And anyone who whines about AI code quality? Hold my beer, look at our 1980s source.
I routinely have a task manager feed eight parallel Claude Code Opus 4 sessions their next source file to study for a specific purpose, to get through all 57 faster. That will hit my $200 Max limit, reliably.
Of course I should just wait a year, and AI will read the code base all at once. People _talk_ like it does now. It doesn't. Filtering information is THE critical issue for managing AI in 2025.
The most useful tool I've written to support this effort is a tmux interface, so AI and I can debug together two terminal sessions at once: The old 32-bit code running on a Linode instance, and the new 64-bit code running locally on macOS. I wasn't happy with how the tools for this worked, that I could find online. It blows my mind to watch Opus 4 debug.
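A minimal sketch of that kind of tmux glue in Python; `send-keys` and `capture-pane` are real tmux subcommands, while the session name used in the example is invented:

```python
import subprocess

def send_keys(target: str, command: str) -> list[str]:
    """Build the tmux invocation that types `command` into pane `target`."""
    return ["tmux", "send-keys", "-t", target, command, "Enter"]

def capture_pane(target: str) -> list[str]:
    """Build the tmux invocation that dumps pane `target`'s contents."""
    return ["tmux", "capture-pane", "-p", "-t", target]

def run(argv: list[str]) -> str:
    # Actually execute against a running tmux server.
    return subprocess.run(argv, capture_output=True, text=True).stdout

# e.g. drive the old 32-bit session and read back its output:
# run(send_keys("linode32", "./server --debug"))
# output = run(capture_pane("linode32"))
```

With two such targets (the remote 32-bit session and the local 64-bit one), an agent can type into either terminal and read back what happened.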
I'd be extremely surprised if Anthropic picked now of all times to decide on COGS optimisation. They potentially can take a significant slice of the entire DevTools market with the growth they are seeing, seems short sighted to me to nerf that when they have oodles of cash in bank and no doubt people hammering at their door to throw more cash at them.
I think Claude Code is a much better concept, the coding agent doesn't need to be connected to the IDE at all. Which also means you can switch even faster to a competitor. In that sense, Claude Code may have been a huge footgun. Gaining market share might turn out to be completely worthless.
And I was thinking to myself, “How does this make any sense financially for Anthropic to let me have all of this for $200/month?”
And then I kept getting hit with those overloaded api errors so I canceled my plan and went back to API tokens.
I still have no idea what they’re doing over there but I’ll happily pay for access. Just stop dangling that damn $200/month in my face if you’re not going to honor it with reasonable access.
They've just done the work to tailor it specifically for proper tool using during coding. Once other models catch up, they will not be able to be so stingy on limits.
Google has the advantage here given they're running on their own silicon; can optimize for it; and have nearly unlimited cashflows they can burn.
I find it amusing nobody here in the comments can understand the scaling laws of compute. It seems like people have a mental model of Uber burned into their head thinking that at some point the price has to go up. AI is not human labor.
Over time the price of compute will fall, not rise. Losing money in the short term betting this will happen is not a dumb strategy given it's the most likely scenario.
I know everybody really wants this bubble to pop so they can make themselves feel smart for "calling it" (and feel less jealous of the people who got in early) and I'm sure there will be a pop, but in the long term this is all correct.
Currently it's much more important to manage context, split tasks, retry when needed, avoid getting stuck in an infinite loop, expose the right tools (but not too many), ...
Are they really ready to burn money for 8 years?
Amazon operated at a loss for 9 years, and barely turned a profit for over a decade longer than that. They're now one of the greatest businesses of all time.
Spotify operated at a loss for 17 years until becoming profitable. Tesla operated at a loss for 17 years before turning a profit. Palantir operated at a loss for 20 years before turning a profit.
And this was before the real age of Big Tech. Google has more cashflows they can burn than any of these companies ever raised, combined.
Uber operated at a loss to destroy competition and raised prices after they did that.
Amazon (the retailer) did the same and leveraged their position to enter new more lucrative markets.
Dunno about Spotify, but Tesla and Palantir both secured lucrative contracts and subsidies.
Anthropic is up against companies with deeper pockets and can’t spend to destroy competition; their current business model can only survive if they reduce costs or raise prices. Something’s got to give.
Re: Anthropic specifically, I tend to agree, hence why I'm saying the deeper pockets (eg. Google, Amazon, etc) are perfectly positioned to win here. However, big companies have a way of consistently missing the future due to internal incentive issues. Google is deathly afraid of cannibalizing their existing businesses.
Plus, there's many investors with deep pockets who would love to get in on Anthropic's next round if their technical lead proves to be durable over time (like 6 months in AI terms).
This fight is still early innings.
Small models have also been improving steadily in ability, so it is feasible that a task that needs Claude Opus today could be done by Sonnet in a year's time. This trend of model "efficiency" will add on top of compute getting cheaper.
Although, that efficiency would probably be quickly eaten up by increased appetites for higher-performance, bigger models.
The NLP these models can do is definitely impressive, but they aren't 'thinking'. I find myself easily falling into the habit of filtering a lot of what the model returns and picking out the good parts, which is useful and relatively easy for subjects I know well. But for a topic I am not as familiar with, that filtering (identifying and dismissing) is much less finessed, and a lot of care needs to be taken to not just accept what is being presented. You can still interrogate each idea presented by the LLM to ensure you aren't being led astray, and that is still useful for discovery, like traditional search, but once you mix agents into this, things can go off the rails far more quickly than I am comfortable with.
I don't subscribe to the $100 a month plan, I am paying API usage pricing. Accordingly I have learned how to be much more careful with Claude Code than I think other users are. The first day I used it, Claude got stuck in a loop trying to fix a problem using the same 2 incorrect solutions again and again and burnt through $30 of API credits before I realized things were very wrong and I stopped it.
Ever since then I've been getting away with $3-$5 of usage per day, and accomplishing a lot.
Anthropic needs to find a way to incentivize developers to better use Claude Code, because when it goes off the rails, it really goes off the rails.
Probably better to stay on usage based pricing, and just accept that every API call will be charged to your account.
This is probably another marketing stunt. Turn off the flow of cocaine and have users find out how addicted they are. And they'll pay for the purest cocaine, not for second grade.
Changing the terms of the deal midway through a subscription to make it much less valuable is a really shady business practice, and I'm not sure it's legal.
Users are no doubt working these things even harder than I am. There's no way they can be profitable at $200 a month with unlimited usage.
I think we're going to evolve into a system that intelligently allocates tasks based on cost. I think that's part of what openrouter is trying to do, but it's going to require a lot of context information to do the routing correctly.
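The routing idea reduces to a constrained minimization: pick the cheapest model that clears the task's capability bar. A sketch with an entirely hypothetical model catalog (names, prices, and scores are invented):

```python
# Hypothetical catalog: (model name, $ per million tokens, capability score).
MODELS = [
    ("small-fast", 0.25, 2),
    ("mid-tier", 3.00, 5),
    ("frontier", 15.00, 9),
]

def route(required_capability: int) -> str:
    """Cheapest model whose capability meets the task's requirement."""
    eligible = [m for m in MODELS if m[2] >= required_capability]
    if not eligible:
        raise ValueError("no model can handle this task")
    return min(eligible, key=lambda m: m[1])[0]

print(route(4))  # → mid-tier
```

The hard part, as the comment notes, is estimating `required_capability` for a given request; that's where all the context information comes in.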
I just worry that there’s little incentive for big corporations to research optimising the “running queries for a single user on a consumer GPU” use case. I wonder if getting funding for such research is even viable at all.
The issue is (1) the extra size supports extra knowledge/abilities for the model, and (2) a lot of the open source models are trained in a way that avoids competing with the paid offerings, or lack the data sets of the useful models.
Specifically, it seems like the tool-use heavy “agentic” work is not being pushed to open models as aggressively as the big closed models. Presumably because that’s where the money is.
AWS Bedrock, which seems to be a popular way to get access to Claude etc. while not having to go through another "cloud security audit", will easily run up ~$20-30 bills in half an hour with something like Cline.
Anthropic likely is making bank with this and can afford to lose the less-profitable (or even loss-making) business of lone-man developers.
PMF.
They will turn you into an AI junkie who no longer has motivation to do anything difficult on your own (despite having the skills and knowing how), and then, they will dramatically cut your usage limit and say you’ll need to pay more to use their AI.
And you will gladly pay more, because hey you are getting paid a lot and it’s only a few hundred extra. And look at all the time you save!
Soon you’re paying $2k a month on AI.
This is going to be happening with every AI service. They are all burning cash and need to dumb it down somehow. Whether that's running worse models or rate limiting.
I'm only on the $100 Max plan and stick to the Sonnet model and I'll run into the hard usage limits after about three hours, that's been down to about two hours recently. The resets are about every four hours.