[1] https://www.youtube.com/live/khr-cIc7zjc?si=oI9Fj33JBeDlQEYG
It would be cheaper for your company to literally pay your salary while you do nothing.
A year has 2000 working hours, which is 24,000 5-minute intervals. That means the company is spending at least $240,000 on the Claude API (conservatively). So they would be better off paying you $100-200k to do nothing and hiring someone competent for that $240k.
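A quick sanity check of that arithmetic, as a minimal sketch (the ~$10-per-5-minute-interval rate is my assumption, backed out from the $240k figure):

```python
# Back-of-the-envelope version of the claim above.
working_hours_per_year = 2000
intervals = working_hours_per_year * 12   # 12 five-minute intervals per hour
cost_per_interval = 10                    # USD per interval, assumed

print(intervals)                          # 24000
print(intervals * cost_per_interval)      # 240000 -> $240k/year, i.e. $20k/mo
```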
However, as long as Microsoft is offering copilot at (presumably subsidized) $10/mo, I'm not interested in paying 10x as much and still having limits. It would have to be 10x as useful, and I doubt that.
You can try it for cheap via the normal pay-as-you-go route.
They're good about telling you how full your context is, and you can use /compact to shrink it down to the essentials.
But for those of us who aren't Mr. MoneyBags like you all, keeping an eye on context size is key to keeping costs low.
In contrast, I'm not interested in using cheaper, lesser services for my livelihood.
I use it for very targeted operations where it saves me several round trips to code examples, documentation, and Stack Overflow, not spamming it for every task I need to do. I spend about $1/day of focused feature development, and it feels like it saves me about 50% as many hours as I spend coding while using it.
AI coding saves me a lot of time writing high-quality code, as it takes care of the boilerplate and documentation/API lookups, while I still review every line, and vibe coding lets me quickly do small stuff I couldn't do before (e.g. write a whole app in React Native), but gets really brittle after a certain (small) codebase size.
I'm interested to hear whether Claude Code writes less brittle code, or how you use it/what your experience with it is.
I'm curious, what was the return? What did you do with the 1k?
(updated for better example)
It is not a challenging technical thing to do. I could have sat there for hours reading the conversion from v1 to v2 to v3 to v4. It is mostly just changing class names. But these changes are hard to do with :%s/x/x, so you need to do them manually, one by one, for hundreds of classes. I could have as easily shot myself in the head.
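(To be fair, the purely mechanical subset is scriptable; a rough sketch below, with an illustrative, made-up rename table, assuming Python. It's everything beyond straight one-to-one renames, where the right v4 class depends on the surrounding markup, that forces the manual pass.)

```python
import re
from pathlib import Path

# Illustrative v1 -> v4 renames only; a real table would have hundreds
# of entries, and many conversions are not 1:1 like this.
RENAMES = {
    "flex-grow": "grow",
    "flex-shrink": "shrink",
    "whitespace-no-wrap": "whitespace-nowrap",
}

# Match whole class tokens so "flex-grow" doesn't also hit "flex-grow-0".
keys = sorted(RENAMES, key=len, reverse=True)
pattern = re.compile(r"(?<![\w-])(" + "|".join(map(re.escape, keys)) + r")(?![\w-])")

for path in Path(".").rglob("*.html"):
    text = path.read_text()
    path.write_text(pattern.sub(lambda m: RENAMES[m.group(1)], text))
```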
> Could you anonymize and share your last 5-10 prompts?
The prompt was a simple "convert this site from tailwind v1 to v4". I use Neovim Copilot Chat to inject context and load URLs. I have found that prompts have no value; it is either something the LLM can do or not.
- https://gist.github.com/backnotprop/ca49f356bdd2ab7bb7a366ef...
- https://gist.github.com/backnotprop/d9f1d9f9b4379d6551ba967c...
- https://gist.github.com/backnotprop/e74b5b0f714e0429750ef6b0...
- https://gist.github.com/backnotprop/91f1a08d9c27698310d63e06...
- https://gist.github.com/backnotprop/7f7cb63aceb7560e51c02a9d...
- https://gist.github.com/backnotprop/94080dde34bfca3dd9c48f14...
- https://gist.github.com/backnotprop/ea3a5c3a31799236115abc76...
Taken from 2 recent systems. 90% of my interaction is assurance, debugging, and then having Claude manage the meta-management framework. We work hard to set the path for actual coding, so code output (even complex or highly integrated) usually ends up being fairly smooth and fast.
Could you explain why there is no punctuation?
Basically anything that isn't GPT-4o is premium, and I find GPT-4o near useless compared to Claude and Gemini in Copilot.
It's hit and miss, IMO.
I like it for C#/.NET, but it's completely useless for the rest of the stuff I do (mostly web frontend).
I'm not sure about my usage but if I hit those premium limits I'm probably going to cancel Copilot.
This might mean the $10/month plan is the best value. It depends entirely on how it works for you.
(Caps obviously impact the total benefit so I agree there.)
Just to give you one example: the last BigCo I worked for had a schematic for new projects which resulted in... 2k EUR per month in cloud costs for serving a single static HTML file.
At one point someone up top decided that Kubernetes was the way to go and scrambled together an impromptu schematic for new projects that could simply be described as a continental-class dreadnought of a Kubernetes cluster on AWS.
And it was signed off, and later followed like scripture.
A couple of stories lower, we're having a hard time arguing for a 50 EUR budget for weekly beer for the team, but the company is A-OK with paying 2k EUR for a landing page.
Limits are a given on any plan. It would be too easy for a vibe coder to hammer away 8 hours a day, 20 days a month, if there were nothing stopping them.
The real question is whether this is a better value than pay as you go for some people.
Your vibe coders are on a different dimension than mine.
I don't think this is the right way to look at it. If Copilot helps you earn an extra $100 a month (or saves you $100 worth of time), and this one is ~2x better, it still justifies the $100 price tag.
Additionally, when you’re in a compact distribution, being 5% better might be 100x more valuable to you.
Basically, this assumes that marginal value tracks cost. I don't think most things, economically, match that pattern. I will sometimes pay 10x the cost for a good meal that has fewer calories (nutritional value).
I am glad people like you exist, but I don’t think the proposition you suggest makes sense.
What worked for me was coming up with an extremely opinionated way to develop an application and then generating instructions (mini milestones) by combining it with the requirements.
These instructions end up being very explicit about the sequence of things it should do (write the tests first), how the code should be written, where to place it, etc. So the output ended up being very similar regardless of the coding agent used.
In the codebase I've tried modularity via monorepo, faux microservices with local APIs, monoliths filled with hooks, and all the other centralization tricks in the book, down to the very, very simple. Whatever I could do to bring down the context window needed.
Eventually... your returns diminish, and any time you saved is gone.
And by the time you've burned up a context window and you're ready to get out, you're expecting it to output a concise artifact to carry you to the next chat so you don't have to spend more context getting that thread up to speed.
Inevitably the context window, and the LLM's eagerness to touch shit it's not supposed to (the likelihood of which increases with context), always get in the way.
Anything with any kind of complexity ends up as a game of too much bloat, or the LLM removing pieces that break other pieces it wasn't aware of.
/VENT
Using Gemini 2.5 for generating instructions
This is the guide I use
https://github.com/bluedevilx/ai-driven-development/blob/mai...
You have to puppeteer it and build a meta context/tasking management system. I spend a lot of time setting Claude Code up for success. I usually start with Gemini for creating context, development plans, and project tasking outlines (I can feed large portions of the codebase to Gemini and rely on its strategy). I've even put entire library docsites in my repos for Claude Code to use, but today they announced web search.
They also have todos built in, which makes the above even more powerful.
The end result is insane productivity - I think the only metric I have is something like 15-20k lines of code for a recent distributed processing system from scratch over 5 days.
https://gist.github.com/backnotprop/4a07a7e8fdd76cbe054761b9...
The framework is basically the instructions plus my general guidance for updating and ensuring critical details get injected into context. Some of those prompts I commented here: https://news.ycombinator.com/item?id=43932858
> I spend a lot of time setting Claude code up for success.
Normally I wouldn't post this because it's not constructive, but this piece stuck out to me and had me wondering if it's worth the trade-off. Not to mention programmers have spent decades fighting against LoC as a metric, so let's not start using it now!
I've done just about everything across the full & distributed stack, so I'm down to jam on my code/systems and how I instruct and confidently rely on AI to help build them.
I don't think I've ever done this or worked with anyone who had this type of output.
I daily-drive Cursor and I have rules to limit comments. I get comments on complex lines, and that's it.
A lot of people seem to have these magic incantations that somehow make LLMs work really well, at the level marketing and investor hype says they do. However, I rarely see that in the real world. I'm not saying this is true for you, but absent vaguely replicable examples that aren't just basic webshit, I find it super hard to believe they're actually this capable.
For context, this is aider tracking aider's code written by an LLM. Of course there's still a human in the loop, but the stats look really cool. It's the first time I've seen such a product work on itself and tracking the results.
If you don't like what it suggests, undo the changes, tweak your prompt and start over. Don't chat with it to fix problems. It gets confused.
https://gist.github.com/rachtsingh/e3d2e2b495d631b736d24b56e...
Is it correct? Sort of; I don't trust the duration benchmark because benchmarking is hard, but the size should be approximately right. It gave me a pretty clear answer to the question I had and did it quickly. I could have done it myself but it would have taken me longer to type it out.
I don't use it in large codebases (all agentic tools for me choke quickly), but part of our skillset is taking large problems and breaking them into smaller testable ones, and I give that to the agents. It's not frequent (~1/wk).
Example:
I'm wrapping up, right now, an updated fork of the PHP extension `phpredis`. Redis 8 was recently released with support for a new data type, vector sets, but the phpredis extension (which is far more performant than non-extension Redis libraries for PHP) doesn't support the new vector-related commands. I forked the extension repo, which is in C (I'm a PHP developer; I had to install CLion for the first time just to work along with CC), and fired up Claude Code with the initial prompt/task of analyzing the extension's code and documenting, in a CLAUDE.md file, the purpose, conventions, and anything that it (Claude) felt would benefit the bootstrapping of future sessions, so that whole files wouldn't need to be read in.
Depending on the size of the codebase, this initial pass could be "expensive". Being that this is merely a PHP extension and not a huge codebase, I was fine letting it just rip through the whole thing however it saw fit; were this a larger codebase I'd take a more measured approach to this initial "indexing".
This results in a file that Claude uses the way we'd use a README.
Next I end this session, start a new one, and tell it to review that CLAUDE.md file (I specifically tell it to do this at every single new session start, moving forward) and then generate a general overview/plan of what needs to be done to implement the new Vector Set-related commands so that I can use this custom phpredis extension in my PHP environments. I indicated that I wanted a suite of tests focused on ensuring each command works with all of its various required and optional parameters, and that I wanted to use Docker containers for the testing rather than mess up my local dev environment.
$22 in API costs and ~6 hours spent, and I have the extension working in my local environment with support for all of the commands I want/need to use. (There are still 5 commands that I don't intend to use that I haven't implemented.)
Not only would I certainly never have embarked upon extending a C PHP extension, I wouldn't have done so over the course of an evening and a morning.
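(For flavor, the vector set commands in question look roughly like this. This is a sketch using redis-py's generic execute_command rather than the phpredis extension itself, with made-up key and element names; check the Redis 8 docs for the exact argument order.)

```python
import redis

r = redis.Redis()

# VADD: add elements with 3-dimensional vectors to a vector set
# (Redis 8 syntax: VADD key VALUES num val ... element).
r.execute_command("VADD", "docs:vectors", "VALUES", "3", "0.10", "0.20", "0.30", "doc:1")
r.execute_command("VADD", "docs:vectors", "VALUES", "3", "0.11", "0.21", "0.29", "doc:2")

# VSIM: find the elements most similar to an existing element.
similar = r.execute_command("VSIM", "docs:vectors", "ELE", "doc:1",
                            "WITHSCORES", "COUNT", "5")
print(similar)
```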
Another example:
Before this Redis vector sets thing, I used CC to build a Python image and text embedding pipeline backed by Redis streams and Celery. It consumes tasks pushed to the stream by my Laravel application, which currently manages ~120 million unique strings and ~65 million unique images that I've been generating embeddings for. Prior to this I'd spent very little time with Python and zero with anything ML-related. Now I have a performant, portable Python service that I run from my MacBook (M2 Pro) or various GPU-having Windows machines in my home, generating the embeddings on an 'as available' basis and pushing the results back to a Redis stream that my Laravel app then consumes and processes.
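The rough shape of a pipeline like that, as a minimal sketch: the stream names, field layout, and placeholder embed() are all made up here, and the real service presumably uses consumer groups and an actual model.

```python
import redis
from celery import Celery

app = Celery("embeddings", broker="redis://localhost:6379/0")
r = redis.Redis()

def embed(text: str) -> list[float]:
    # Hypothetical placeholder; the real worker would call an
    # image/text embedding model here.
    return [0.0] * 768

@app.task
def embed_task(task_id: str, text: str) -> None:
    vector = embed(text)
    # Push the result onto a stream the Laravel app consumes.
    r.xadd("embeddings:results",
           {"task_id": task_id, "vector": ",".join(map(str, vector))})

def consume_pending() -> None:
    # Drain tasks the Laravel app pushed onto the work stream and fan
    # them out to Celery workers on whatever machine is available.
    for _stream, messages in r.xread({"embeddings:tasks": "0"}, count=100, block=5000):
        for msg_id, fields in messages:
            embed_task.delay(fields[b"task_id"].decode(), fields[b"text"].decode())
            r.xdel("embeddings:tasks", msg_id)
```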
The results of these embeddings and the similarity-related features that they've brought to the Laravel application are honestly staggering. And while I'm sure I could have spent months stumbling through all of this on my own, I wouldn't have; I don't have that much time for side-project curiosities.
Somewhat related - these similarity features have directly resulted in this side project becoming a service people now pay me to use.
Day to day, the effectiveness is a learned skill. You really need to learn how to work with it, in the same way you, as a layperson, wouldn't stroll up to a highly specialized piece of aviation technology and just infer how to use it optimally. I hate to keep parroting "skill issue", but it's just wild to me how effective these tools are and how many people don't seem to be able to find any use.
If it's burning through cash, you're not being focused enough with it. If it's writing code that's always slightly wrong, stop it and make adjustments. Those adjustments likely need to be documented in something like I described above: a long-running document used similarly to a prompt.
From my own experience: I watch the "/settings/logs" route on Anthropic's website while CC is working, once I know we're getting rather heavy with the context. Once it gets into the 50-60k token range, I either aim to wrap up the current task, or I accept that things are going to start getting a little wonky past 80k. It'll keep working up into the 120-140k tokens or more, but you're likely to end up with lots of "dumb" stuff happening; you really don't want to be there unless you're _sooooo close_ to getting done what you're trying to. When the context gets too high and you need/want to reset but you're mid-task, run /compact [add notes here about next steps] and it'll generate a summary that will be used to bootstrap the next session. (Don't do this more than once, really, as it starts losing a lot of context; just reset the session fully after the first /compact.)
If you're constantly running into huge contexts, you're not being focused enough. If you can't work on anything without reading files that are thousands of lines long, either break up those files somehow or be _really_ specific with the initial prompt and context, which I've done lots of. Say I have a model in a 10+ year old project that is 6000 lines long and I want to work on a specific method in it: I'll tell Claude in the initial message/prompt which lines that method starts and ends on, and how many lines from the start of the file to read (so it can get the namespace, class name, properties, etc.), and then let it do its thing. I'll tell it specifically not to read more than 50 lines of that file at a time when looking for or reviewing something, or even to stop and ask me to locate a method or its usages rather than reading whole files into context.
So, again, if it's burning through money, focus your efforts. If you think you can just fire it up and give it a generic task, you're going to burn money and get either complete junk or something that might technically work but is hideous, at least to you. But if you're disciplined and set up boundaries and systems for it to adhere to, it does, for the most part.
Whether it turns out to be cheaper depends on your usage.
I thought Claude Code was absurdly expensive and not at all more capable than something like ChatGPT combined with Copilot.
Do people really get that much value from these tools?
I use Github's Copilot for $10 and I'm somewhat happy for what I get... but paying 10x or 20x that just seems insane.
Also the world is much bigger than the US.
Tons of software developer jobs in the US for non-FAANG tier or unicorn startup companies are >$100k and easily hit $120-150k.
Also the fourth quintile mean was like $120k in the US in 2022. So you'd be in the top 30% of earners making that kind of money, not the top 10%.
https://taxpolicycenter.org/statistics/household-income-quin...
So still way below $240k, no?
> So you'd be in the top 30% of earners making that kind of money, not the top 10%.
Maybe you missed it but I actually wrote "10-20%".
Also, in 2024, earning $100k puts you in the top 20% of the US population.
https://dqydj.com/salary-percentile-calculator/
(which is already way above even the EU for dev salaries)
Also, I noticed where our sources diverged. I was looking at household income. My bad.
> which is already way above even the EU for dev salaries
Maybe they're underpaid.
Either way, I was responding to the idea that only a FAANG salary would cost an employer $20k/mo. For US software developer jobs, it can easily hit that without being in FAANG-tier or unicorn startup level companies. Tons of mid-sized low-key software companies you've never heard of pay $120k+ for software devs in the US.
The median software developer in Texas makes >$130k/yr. Think that's all just Facebook and Apple and Silicon Valley VC-funded startup software devs? Similar story in Ohio; is that a place loaded with unicorn software startups? Those median salaries in those markets probably cost their employers around $20k/mo.
https://www.ziprecruiter.com/Salaries/Senior-Software-Engine...
https://www.ziprecruiter.com/Salaries/Senior-Software-Engine...
Median salary for a Japanese dev is ~$60k. Same range for Europe (Switzerland at ~$100k and Italy at ~$30k for the extremes). Then you go down:
- Russia: ~$37,000
- Brazil: ~$31,500
- India: ~$30,000
- Indonesia: ~$13,500
- Morocco: ~$11,800
- Nigeria: ~$6,000
(I asked ChatGPT for the numbers further down the list; the JP and EU numbers are mostly correct though, as I have first-hand experience.)
I imagine a lot of people saw $20k/mo and thought the salary clearly had to be $200k+.
In the end, I was able to rescue the code part, rebuilding a 3-month, 10-person project in 2 weeks, with another 2 weeks to implement a follow-up series of requirements. The sheer amount of discussion and code creation would have been impossible without AI, and I used the full limits I was afforded.
So to answer your question, I got my money's worth in that specific use case. That said, the previous failing effort also unearthed a ton of unspoken assumptions that I was able to leverage. Without providing those assumptions to the AI, I couldn't have produced the app they wanted. Extracting that information was like pulling teeth, so I'm not sure we would have really had a better situation if we had started off with everyone having an OpenAI Pro account.
* Those who work in enterprise know intuitively what happened next.
It still doubles down on non-working solutions.
So maybe Anthropic setting this precedent will solve my problem!
ps - catch up for social Zoom beers?
I pinged what I think is the right ghuntley on LinkedIn; rizzler looks like the next feature I'm building for brokk :)
https://news.ycombinator.com/newsguidelines.html