You can skim the transcript, but some personal highlights:
- anthropic employees, with unlimited claude, average to $6/day of usage
- headless claude code as a "linux" utility that you use everywhere in CI is pretty compelling
- claude code as a user extensible platform
- future roadmap of claude code: sandboxing, branching, planning
- sonnet 3.7 as a persistent, agentic model
For a tool that radically increases productivity (say 2x), I think it could still make sense for a VC-funded startup or an established company (even $100/day, or ~$36k/year, is still a lot less than hiring another developer). But for a side project or bootstrapped effort, $36k/year obviously significantly increases cash expenses. $100/month does not, however.
So, I'm going to go back, upgrade to Max, and try it again. If that keeps my costs to $100/month, that's a really different value proposition.
Edit:
Found the answer to my own questions
> Send approximately 50-200 prompts with Claude Code every 5 hours[1]
Damn. That's a really good deal
[1] https://support.anthropic.com/en/articles/11145838-using-cla...
I was listening to this podcast yesterday and I also did a double take when I heard the $6 per day number.
From the link:
"Apparently, there are some engineers inside of Anthropic that have spent >$1,000 in one day!"
The question is what is the P50, P75, and P95 spend per employee?
I do feel like SNR x quantity could be higher, but it's still a challenge even to keep it where it is today. My work-life balance/stress levels aren't the best, and everyone expects everything from me.
The "golden" end state of coding agents is that you give one a feature request (e.g. a Jira ticket), and it gives you a PR to review and give feedback on. Cursor, Windsurf, etc. are dead ends in that sense, as they are local editors and cannot run in CI.
If you are tooling your codebase for optimal AI usage (rules, MCP, etc.), you should target a technology that can bridge the gap to headless usage. The fact that Claude Code can trivially be used as part of automation through its tooling means it's now the default way I think about coding agents (Codex, the npm package, is the same).
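For illustration, a minimal sketch of what "headless in CI" can look like as a GitHub Actions step. The step name and prompt are made up, and the flags shown are assumptions based on Claude Code's headless print mode (`claude -p`); treat this as a shape, not a tested workflow:

```yaml
# Hypothetical CI step: install the CLI and run one headless prompt,
# capturing structured output for later steps to consume.
- name: Triage the failing build with Claude Code
  run: |
    npm install -g @anthropic-ai/claude-code
    claude -p "Summarize why CI failed and propose a fix as a patch" \
      --output-format json > claude-report.json
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

The point is that because it's a plain CLI, it slots into any runner the same way `grep` or `jq` would.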
Disclaimer, I focus on helping companies tool their codebases for optimal agent usage, so I might have a bias here to easily configurable tools.
I see your point, but on the other hand, how depressing to be left with only the most soul-crushing part of software engineering: the Jira ticket.
I understand the craft of code itself is what some people love though!
In fact, that's the main reason I like developing quick prototypes and small projects with LLMs. I use them less to write code for me, and more to cut through the bullshit "research" phase of figuring out what code to write, which libraries to pick, what steps and auxiliary work I'm missing in my concept, etc.
> Can LLMs come up with the 1% ideas that break through? Paired with great execution
It's more like 0.01%, and it's not the target anyway. The world doesn't run on breakthroughs and great execution, it runs on the 99.99% of the so-so work and incremental refinement.
We will drop the narrow-minded deadweight that can only collect naive requirements, and the coding side that can only implement unambiguous tickets.
That's what I want and look forward to one day.
I hate using voice for anything. I hate getting voice messages, I hate creating them. I get cold sweats just thinking about having to direct 10 AI Agents via voice. Just give me a keyboard and a bunch of screens, thanks.
Brainstorming/whiteboarding, 1:1s or performance feedback, team socialization, working through something very difficult (e.g. pair debugging): in-person or video
Incidents, asking for quick help/pointers, small quick questions, social groups, intra-team updates: IM
Bigger design documents and their feedback, trickier questions or debugging that isn't urgent, sharing cool/interesting things, inter-team updates: Email
It's hard for me to believe that there are psychopaths among us who prefer a call on the phone, a Slack huddle, or even organizing meetings, instead of just calmly writing messages on IM over coffee.
And don’t get me started on video vs text for learning purely non-physical stuff like programming…
sending audio = fast
Though I'll gladly call it various foul names when it's refusing to do what I expected it to do.
Yeah, I think I’d rather click and type than talk, all day.
Some of us who’ve been in this game for a while consider having healthy hands to be a nice break between episodes of RSI, PT, etc. YMMV of course but your muscle stamina won’t be the problem, it’s your tendons and eventually your joints.
I've done more typing than speaking for over 40 years now, and I've never had any carpal tunnel or joint problems with my hands (my feet on the other hand.. hoo boy!) and I've always used a standard layout flat QWERTY keyboard.. but I never bend my hands into that unnatural "home row" position.
I type >60wpm using what 40 years ago was "hunt and peck" and evolved over brute-force usage into "my hands know where the keys are; I am right-handed, so my right hand monopolizes 2/3 of the keyboard; both hands know where every key is, so either one can take over the keyboard if the other is unavailable (holding food, holding a microphone for when I do do voice work, using the mouse, etc.)".
But as a result my hands also evolved this bespoke typing strategy which naturally avoids uncomfortable poses and uncomfortable repetition.
1. Superwhisper - https://superwhisper.com
2. Macwhisper - https://goodsnooze.gumroad.com/l/macwhisper
3. Carelesswhisper - https://carelesswhisper.app
I’m very sorry for you if this is literally true. I would urge you to seek medical help, as this is not normal at all.
I think I'm marginally faster using speech to text than using a predictive text touch keyboard.
But it makes enough mistakes that it's only very slightly faster, and I have a very mild accent. I expect for anyone with a strong accent it's a non starter.
On a real keyboard where I can touch type, it's much slower to use voice. The tooling will have to improve massively before it's going to be better to work by speaking to a laptop.
But I can appreciate that sitting down in front of a keyboard and going at it with low typing speed seems unnatural and frustrating for probably the majority of people. To me, in front of a keyboard is a fairly natural state. Somebody growing up 15 years before (got by without PCs in their early years) or after me (got by with a smartphone) probably doesn't find it as natural.
That's exactly what everyone is hoping for. Well, everyone except software engineers, of course
I don't understand the pleasure in putting people out of work and inflicting pain on people's lives and careers, but I guess that's just me.
If the cost of writing software goes down, demand for it will presumably go up...
All those skills can be applied to engineering as well. What makes Fabrice Bellard great? It's not just technical skill, I think.
I think some of the most successful people will be a subset of engineers but also Steve Jobs types and artists
Having all this in one person is super valuable because you lose a lot of speed and fidelity in information exchange between brains. I wouldn't be surprised if someone could hit like 30-50 kloc/day within a few years. I can hit 5-10kloc/day doing this stuff depending on a lot of factors, and that's driving ~2 agents at a time mostly. Imagine driving 20.
Here is some unsolicited advice: it is not going to be the new hotshot developer, but rather hotshot technical and solution architects.
The dream of CASE tools is finally here: pump the requirements into the software factory (aka instruction files), and the replicator handles the rest.
Isn’t that effectively the promise of the most recently released OpenAI codex?
From the reviews I’ve been able to find so far though, quality of output is ehh.
I bias a bit to wanting the agent to be a pluggable component into a flow I own, rather than a platform in a box.
It'll be interesting to see where the different value props/use cases of a Devin/v0 vs a Codex Cloud vs Claude Code/Codex CLI vs Cursor land.
Put the Aider CLI into a GitHub action that's triggered by an issue creation and you're good to go.
But it's 100% the same class of tool and the awesome part of the unixy model is hopefully agents can be substituted in for each other in your pipeline for whichever one is better for the usecase, just like models are interoperable.
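As a sketch of that "substitutable agent in a pipeline" idea, here is what the issue-triggered Aider setup could look like. The trigger wiring and flags are assumptions for illustration (Aider does ship `--message` and `--yes` for non-interactive runs), not a tested workflow:

```yaml
# Hypothetical workflow: a new issue becomes a non-interactive Aider run.
# Swapping in another CLI agent would mean changing only the run step.
on:
  issues:
    types: [opened]
jobs:
  aider:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install aider-chat
      - run: aider --yes --message "${{ github.event.issue.body }}"
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```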
The main difference is that I interact with Claude Code only through conversation. Aider felt much more like I was talking to two different tools, the model and Aider. For example, constantly having to add files, and parsing the less-than-ideal console output compared to how Claude Code handles user feedback.
I personally see that as a plus, because other tools are lacking on the tool side. Aider seems to have solid "traditional" engineering behind its tooling.
"constantly having to add files"
That's fair. However, Aider automatically adds files that trigger it via comments and it asks to add the files that are mentioned in the conversation.
"parse the less than ideal console output"
That's fair too. Still, the models aren't there yet, so I value tools that don't hide the potential crap that the models produce 20-30% of the time.
Like Anthropic and most big tech companies, they don't want to show off their best until they need to. They used to stockpile some cool features and had time to think about their strategy. But now I feel like they are in a rush to show off everything, and I'm worried whether management has time to think about the big picture.
No, any team.
Management only appears to have power because ownership wants workers to direct their ire at managers instead of at the real power.
This is I guess what happens when you follow capitalism to its logical conclusion. It's exactly what you expect from some reinforcement learning algorithm that only knows how to climb a gradient to maximize a singular reward. The concept of commerce has become the proverbial rat in the skinner box. It has figured out how to mainline the heroin drip if it just holds down the shock button and rewires its brain to get off on the pain. Sure it's an artificial high and hurts like hell to achieve it, but what else is there to live for? We made the line going up mean everything, so that's all that matters now. Doesn't matter if we don't want it, they want it. So that's what it's going to be.
This.
I am amazed that people usually are blind to this trajectory.
The owner (human) would say "build a company, make me a billion dollars," and that would be the only valuable input needed from him/her. Everything else would be derived and executed by the AI swarm, while the owner plays video games (or generally enjoys the products of other people's AI labor) 100% of the time.
I'd argue GPT-4 (2023) was already AGI. It could output anything you (or Tim Cook, or any other smart guy) could possibly output given the relevant context. The reason it doesn't right now is that we are not passing in all of your life's context. If we achieve this, a human CEO has no edge over an AI CEO.
People are figuring this problem out very quickly, hence the explosion of agentic capabilities happening right now, even though the base models fundamentally do the same stuff as GPT-4.
Of all the professions that are at the risk of being downsized, I think lawyers are up there. We used to consult our lawyers so frequently about things big and small. We have now completely removed the small stuff from that equation. And most of our stuff is small. There is very little of the big stuff and I think LLMs aren't too far from taking care of that as well.
And there's no better example of hourly work than lawyers.
Personally, I've always disliked the model of billing by the hour because it incentivizes the wrong things, but it is easier to get clients to justify these costs (because they're used to thinking in that framework).
I'd rather take on the risk and find ways to do more efficient work. It's actually FUN to do things that way. And nowadays, this is where AI can benefit in that framework the most.
But I know I'm probably in the minority.
Yes, I want it. It would 100x our GDP and make people significantly more independent.
Mix it with open source unbiased AI, living on a land large enough to feed you and your family, and cheap energy, and utopia is here.
The real threats to our profession are things like climate change, extreme wealth concentration, political instability, cultural regression and so on. It's the stuff that software stands on that one should worry about, not the stuff that it builds towards.
The current SOTA models can do some impressive things, in certain domains. But running a business is way more than generating JavaScript.
The way I see it, only some jobs will be impacted by generative AI in the near term. Not replaced, augmented.
*I have a few more safety/scalability changes to make but expecting public launch in a few weeks!
I'm not saying "ban proprietary LLMs", I'm saying: hackers (the ones that used to read sites like this) should have free and open source ones as their main tools.
Yes, because hardware and electricity aren't free.
I literally DO pay for every command. I just don't get an itemized bill so there's no transparency about it. Instead, I made some lump-sum hardware payment which is amortized over the total usage I get out of it, plus some marginal increase in my monthly electric bill when I use it.
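As a back-of-the-envelope version of that amortization, with all numbers invented for illustration (they are not from the comment):

```python
# Rough amortized cost of local inference: hardware spread over its
# useful life, plus the marginal electricity of actually running it.
hardware_cost = 2000.0    # one-time GPU/workstation payment, USD
lifetime_days = 3 * 365   # assume ~3 years of useful life
watts_under_load = 350.0  # power draw while generating
hours_per_day = 4.0       # active inference time per day
price_per_kwh = 0.15      # USD per kilowatt-hour

amortized_hw = hardware_cost / lifetime_days
electricity = watts_under_load / 1000 * hours_per_day * price_per_kwh
per_day = amortized_hw + electricity

# Every command is paid for; the "bill" just isn't itemized.
print(f"${per_day:.2f}/day")
```

Under these made-up assumptions it works out to around $2/day, which is why "free local inference" is really "prepaid local inference."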
I was doing this with Cursor and MCPs. Got about a full day of this before I was rate limited and dropped to the slowest, dumbest model. I’ve done it with Claude too and quickly exhaust my rate limits. And the PRs are only “good to go” about 25% of the time, and it’s often faster to just do it right than find out where the AI screwed up.
The only possible way for this to be a successful offering is if we have just now reached a plateau of model effectiveness and all foundation models will now trend towards having almost identical performance and capabilities, with integrators choosing based on small niceties, like having a familiar SDK.
At this point Claude Code is a software differentiator in the agent coding space.
I am building things related to AI code assistants - we were hacking ways to integrate Claude Code - it was the first thing we wanted to build around.
It's too early to care about lock in.
Need the best, will only build around the best.
Honestly though, CLI tools for accessing LLMs (including piping content in and out of them) are such a clearly good idea that I'm glad to see more tools implementing the pattern.
pip install openai
export OPENAI_API_KEY="..."
openai api completions.create \
--model gpt-4.1-mini \
--prompt "tell a joke about a fish"
Then you can instrument through metaprogramming. For instance, an alert system could be:
"If the threshold goes over 1.0, contact the on-call person through their preferred method" - which may work ... maybe.
Or:
if any( "check_condition {x}", condition_set ): find_person("on call", right now).contact("preferred")
... the point is to divide everything up into small one-shots, parallelize them, and use it as glue/API. Then you get composability. If you can get a framework for coroutines going, then it's real game on. The final step is "needs-based pulling," which is an inversion of MCP: contextual streams as event-based subsystems.
Things are still too slow for this to be not painful but that won't be the case forever.
Currently everything is linear. Doesn't have to be ... really doesn't.
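The "many small one-shots, fanned out in parallel, composed as glue" idea can be sketched like this. The `llm_one_shot` stub stands in for a real model call (that substitution is the assumption here); everything else is ordinary composition:

```python
from concurrent.futures import ThreadPoolExecutor

def llm_one_shot(prompt: str) -> str:
    """Stub for a single model call; swap in a real API/CLI invocation."""
    return f"ok: {prompt}"

def check_condition(metric: str, threshold: float, value: float) -> bool:
    # Each check is a small, independent one-shot; in the comment's
    # framing, this too could be delegated to a model.
    return value > threshold

condition_set = [("cpu", 1.0, 1.3), ("latency", 1.0, 0.4), ("errors", 1.0, 2.2)]

# Fan the one-shots out in parallel, then compose results as plain code.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda c: check_condition(*c), condition_set))

if any(results):
    # The comment's find_person("on call").contact("preferred"), stubbed:
    print(llm_one_shot("draft an alert for the on-call engineer"))
```

Nothing here is linear: the checks run concurrently, and the glue only joins at the `any()`.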
Can somebody please tell me what software product or service doesn’t compete with general intelligence?
Imagine selling intelligence with a legal term that, under strict interpretation, says you’re not allowed to use it for anything.
Is it so vague it’s unenforceable?
How do we own the output if we can’t use it to compete with a general intelligence?
Is it just a “lol nerd no one cares about the legal terms” thing? If no one cares then why would they have a blanket prohibition on using the service ?
We’re supposed to accept liability to lose a lawsuit just to accept their slop? So many questions
[0] https://aider.chat/docs/scripting.html
[1] https://aider.chat/docs/recordings/tree-sitter-language-pack...
Add a file to your repo and you can talk to any model via issues.
I don't really want it committing and stuff; I mostly like the UX of Claude Code. Thoughts?
[0]: https://docs.anthropic.com/en/docs/claude-code/github-action...
As it only accepts an API key as far as I can tell.
This SDK currently supports only command line usage. Isn't that just what we already had?
I don't understand what's actually new here. What am I missing?
However I feel what we really need is to have an open source version of it where you can pass any model and also you can compare different models answers.
(Aider and other alternatives really don't feel as good to use as Claude Code.)
I know this is not what anthropic would want to do as it removes their moat, but as a consumer I just want the best model and not be tied to an ecosystem. (Which I imagine is the largest fear of LLM model providers)
What does Claude Code do better than Aider?
It's still under development but looks promising.
(I am not affiliated with this project, just a user.)
andrewstuart•1mo ago
Works for a reasonable chunk of files, say 5 to 10, that aren't too big.
No doubt they’ll get to better file access.
Anyhow, I'm quite happy to do the copy and paste, because Gemini's coding and debugging capability is far better than Claude's.
danenania•1mo ago
The default planning/coding models are still Sonnet 3.7 for context size under 200k, but you can switch to Gemini with `\set-model gemini-preview`.
1 - https://github.com/plandex-ai/plandex
dimitri-vs•1mo ago
I really like the idea of Claude Code, but it's rare that I fully spec out a feature on my first request, and I can't see how it can be used for frontend features that require a lot of browser-centric iteration/debugging to get right.
mickeyp•1mo ago
If you (or anyone else reading this) want to try out the upcoming beta, give me a ping (see profile).
Sajarin•1mo ago
I think Bard (lol) and Gemini got a late start and so lots of folks dismissed it but I feel like they've fully caught up. Definitely excited to see what Gemini 3 vs GPT-5 vs Claude 4 looks like!
fallinditch•1mo ago
I suspect that I experience some performance throttling with Gemini 2.5 in my Windsurf setup, because it's just not as good as anecdotal reports by others, and benchmarks, suggest.
I also seem to run up against a kind of LLM laziness sometimes, when they seemingly can't be bothered to answer a challenging prompt ... a consequence of load balancing in action, perhaps.
mbesto•1mo ago
EDIT: Specifically: https://openrouter.ai/rankings/programming?view=week
cube2222•1mo ago
Gemini 2.5 Flash, on the other hand, has been excellent. I've started using it to rewrite whole files after talking the changes through with Claude, because it's just so ridiculously fast (and dependable enough for applying already-outlined changes).
ramoz•1mo ago
The two work really well with Gemini as a planner and Claude Code as an executor.
andy12_•1mo ago