My experience with Claude Code after two weeks of adventures

https://sankalp.bearblog.dev/my-claude-code-experience-after-2-weeks-of-usage/

392•dejavucoder•6mo ago

Comments

wahnfrieden•6mo ago

One approach to improving CC's search is to use Repo Prompt https://repoprompt.com

dejavucoder•6mo ago

Thanks, I will check this out

voicedYoda•6mo ago

Be lovely if i could sign up for Claude using my g voice number

fuzzy2•6mo ago

Or no number even. Sucks to be missing out, but I won’t budge on this.

ctoth•6mo ago

Sometimes, you'll just have a really productive session with Claude Code doing a specific thing that maybe you need to do a lot of.

One trick I have gotten some milage out of was this: have Claude Code research Slash commands, then make a slash command to turn the previous conversation into a slash command.

That was cool and great! But then, of course you inevitably will interrupt it and need to do stuff to correct it, or to make a change or "not like that!" or "use this tool" or "think harder before you try that" or "think about the big picture" ... So you do that. And then you ask it to make a command and it figures out you want a /improve-command command.

So now you have primitives to build on!

Here are my current iterations of these commands (not saying they are optimal!)

https://github.com/ctoth/slashcommands/blob/master/make-comm...

https://github.com/ctoth/slashcommands/blob/master/improve-c...

whatever1•6mo ago

I find amazing all the effort that people put trying to program a non deterministic black box. True courage.

ctoth•6mo ago

Oh do let me tell you how much effort I put into tending my non-deterministic garden or relationships or hell even the contractors I am using to renovate my house!

A few small markdown documents and putting in the time to understand something interesting hardly seems a steep price!

blub•6mo ago

The contractors working on my house sometimes paint a room bright pink for no particular reason.

When I point that out, they profusely apologize and say that of course the walls must be white and wonder why they even got the idea of making them pink in the first place.

Odd, but nice fellows otherwise. It feels like they’re 10x more productive than other contractors.

ctoth•6mo ago

I asked my contractor to install a door over the back stairs opening outward, came back, and it was installed opening inward. He told me he tried to figure out a way he could do it in-code, but there wasn't one, so that's what he had to do. I was slightly miffed he didn't consult me first, but he did the pragmatic thing.

This actually happened to me Monday.

But sure, humans are deterministic clockwork mechanisms!

Are you now going to tell me how I got a bad contractor? Because that sure would sound a lot like "you're using the model wrong"

roywiggins•6mo ago

The big difference is that when questioned he didn't profusely apologize and immediately get to work undoing it, but instead gave you a reason that was probably causally related to the decision he made and not a retroactive justification made up on the spot.

johnfn•6mo ago

You're giving people a lot of credit. Often when I ask for justification for stupid decisions I get dumb rationale that makes no sense, or no rationale at all, or something very clearly made up on the spot.

tick_tock_tick•6mo ago

I mean I get your trying to make a joke but a contractors fucking up paint and trying to gaslight you into believing it's the one you signed off on isn't that rare.

savanaly•6mo ago

Also, they paint the whole room for $2.41! When you add that critical detail the analogy comes into focus.

People aren't excited about AI coding because it's so much better than human coders. They're excited because it's within spitting distance while rounding down to free.

simlevesque•6mo ago

Our brains are non deterministic black box. We just don't like to admit it.

oldenlessons•6mo ago

"This system has severe limitations. But we should rely more on it because this other system is also limited."

For several decades we've developed and use computers because they can be very precise and deterministic.

johnfn•6mo ago

You might also find it amazing that people work with colleagues and give them feedback!

aniforprez•6mo ago

Colleagues take that feedback and improve. It seems a large number of people here seem to find these kind of improvements in other humans useless in the hypothetical case of them leaving but I occasionally look at my mentee's LinkedIn and feel a sense of pride at having contributed. I also feel joy when they appreciate my efforts.

I genuinely don't understand how often people compare AI to junior developers.

iambateman•6mo ago

Claude Code is hard to describe. It’s almost like I changed jobs when I started using it. I’ve been all-in with Claude as a workflow tool, but this is literally steroids.

If you haven’t tried it, I can’t recommend it enough. It’s the first time it really does feel like working with a junior engineer to me.

dejavucoder•6mo ago

Almost feels like a game as you level up!

arealaccount•6mo ago

Weirdly enough I have the opposite experience where it will take several minutes to do something, then I go in and debug for a while because the app has become fubar, then finally realize it did the whole thing incorrectly and throw it all away.

And I reach for Claude quite a bit because if it worked as well for me like everyone here says, that would be amazing.

But at best it’ll get a bunch of boilerplate done after some manual debugging, at worst I spend an hour and some amount of tokens on a total dead end

jm4•6mo ago

You can tell Claude to verify its work. I’m using it for data analysis tasks and I always have it check the raw data for accuracy. It was a whole different ballgame when I started doing that.

Clear instructions go a long way, asking it to review work, asking it to debug problems, etc. definitely helps.

vunderba•6mo ago

> You can tell Claude to verify its work

Definitely - with ONE pretty big callout. This only works when a clear and quantifiable rubric for verification can be expressed. Case in point, I put Claude Code to work on a simple react website that needed a "Refresh button" and walked away. When I came back, the button was there, and it had used a combination of MCP playwright + screenshots to roughly verify it was working.

The problem was that it decided to "draw" a circular arrow refresh icon and the arrow at the end of the semicircle was facing towards the circle centroid. Anyone (even a layman) would take one look at it and realize it looked ridiculous, but Claude couldn't tell even when I took the time to manually paste a screenshot asking if it saw any issues.

While it would also be unreasonable to expect a junior engineer to hand-write the coordinates for a refresh icon in SVG - they would never even attempt to do that in the first place realizing it would be far simpler to find one from Lucide, Font Awesome, emojis, etc.

DrewADesign•6mo ago

In general, using your own symbol forms for interactions rather than taking advantage of people’s existing mental models is a bad idea. Even straying from known libraries is shaky unless you’re a competent enough designer to understand what specific parts of a visual symbol signify that specific idea/action, and to whom. From a usability perspective, you’re much better off not using a symbol at all than using the wrong one.

yakz•6mo ago

I second this and would add that you really need an automated way to do it. For coding, automated test suites go a long way toward catching boneheaded edits. It will understand the error messages from the failed tests and fix the mistakes more or less by itself.

But for other tasks like generating reports, I ask it to write little tools to reformat data with a schema definition, perform calculations, or do other things that are fairly easy to then double-check with tests that produce errors that it can work with. Having it "do math in its head" is just begging for disaster. But, it can easily write a tool to do it correctly.

bigiain•6mo ago

> Clear instructions go a long way, asking it to review work, asking it to debug problems, etc. definitely helps.

That's exactly what I learned. In the early 2000's, from three expensive failed development outsourcing projects.

tcdent•6mo ago

This has a lot to do with how you structure your codebase; if you have repeatable patterns that make conventions obvious, it will follow them for the most part.

When it drops in something hacky, I use that to verify the functionality is correct and then prompt a refactor to make it follow better conventions.

hnaccount_rng•6mo ago

Yeah that is kind of my experience as well. And - according to the friend who highly recommended it - I gave it a task that is "easily within its capabilities". Since I don't think I'm being gaslighted, I suspect it's me using it wrong. But I really can't figure out why. And I'm on my third attempt now..

0x_rs•6mo ago

Some great advice I've found that seems to work very well: ask it to keep a succinct journal of all the issues and roadblocks found during the project development, and what was done to resolve or circumvent them. As for avoiding bloating the code base with scatterbrained changes, having a tidy architecture with good separation of concerns helps leading it into working solutions, but you need to actively guide it. For someone that enjoys problem-solving more than actually implementing them, it's very fun.

taude•6mo ago

to continue on this, I wouldn't let claude or any agent actually create a project structure, i'd guide it in the custom system prompt. and then in each of the folders continue to have specific prompts for what you expect the assets to be coded like, and common behavior, libraries, etc....

gonzo41•6mo ago

So you've invented writing out a full business logic spec again.

btw, I'm not throwing shade. I personally think upfront design through a large lumbering document is actually a good way to develop stuff. As you either do it upfront, or through endless iterations in sprints for years.

bugglebeetle•6mo ago

Yeah, my experience of working with Claude Code is that I’m actually far more conscientious about design. After using it for awhile, you get a good sense of its limits and how you need to break things down and spell thing out to overcome.

underdeserver•6mo ago

The problem with waterfall wasn't the full business spec, it was that people wrote the spec once and didn't revise it when reality pushed back.

taude•6mo ago

I spent 10 minutes writing out the business logic, you don't have to do it all at once. We're not talking about long complicated things here.

actinium226•6mo ago

> For someone that enjoys problem-solving more than actually implementing them, it's very fun.

So, is Claude just something you use for fun? Would you use it for work?

taude•6mo ago

do you create the claude.md files at several levels of your folder structure, so you can teach it how to do different things? Configuring these default system prompts is required to get it to work well.

I'd definitely watch Boris's intro video below [1]

[1] Boris introduction: https://www.youtube.com/watch?v=6eBSHbLKuN0 [2] summary of above video: https://www.nibzard.com/claude-code/

dawnerd•6mo ago

By the time you do all of that you might as well just write code by hand.

serf•6mo ago

that's really just a scale question.

Yes, I would write a 4 line bash script by myself.

But if you're trading a 200 line comprehensive claude.md document for a module that might be 20k LoC? it's a different value proposition.

darkwater•6mo ago

And how do you actually know that those 20k line of codes have no glaring bugs, or bugs that you can find yourself, or be able to understand it completely at some point?

rovr138•6mo ago

The way I do this is by still writing tests.

darkwater•6mo ago

Do tests let you understand a codebase you have not written?

koolba•6mo ago

They can. Particularly if you use them to validate your assumptions about the code.

intrasight•6mo ago

this desire to understand code will be soon be seen as rather anachronistic. What's important is that you understand your tests. Let the AI generate the code.

The spec and the test are your human contribution.

darkwater•6mo ago

I understand your point of view but I think it's too "optimistic", i.e. it will not happen soon, at least not outside AI maximalists.

chasd00•6mo ago

you're describing TDD and it never turned into the panacea that was promised. I'm excited to try claude code, i even have a decent little personal project lined up for it but someone somewhere will always need to understand the code because tests are never 100% exhaustive and major flaws come up.

two_tasty•6mo ago

Ah yes, can't wait to tell my auditor / regulator "I don't understand the code because Claude wrote it, but it's fine, because understand the code is for boomers." That will get a big laugh in a deposition.

intrasight•6mo ago

That'll be anachronistic too obviously. Your tests will be audited.

jimbokun•6mo ago

If the tests are written with sufficient detail that you don't need to look at the code, the implementation of the code is such a small part of the overall work that you are gaining very little in terms of overall productivity.

intrasight•6mo ago

I agree

jimbokun•6mo ago

I would say yes.

To have useful tests, you must write the APIs for the functions, and give examples of how to wire up the various constructs, and correct input/output pairs.

Implementations of those functions that pass the test now have significant constraints that mean you understand a lot about it.

theshrike79•6mo ago

That’s called Test Driven Development.

First you write the tests, then you write code until tests pass.

NeutralCrane•6mo ago

How do you know your own handwritten 20k lines of code have no bugs, or that 20k lines of code written by coworkers have no bugs?

actinium226•6mo ago

I'm not the person you're replying to, but I have a lot more confidence in my own 20k lines of code than an AIs. I've built up skills to write performant, readable, functional, maintainable code. I build it up slowly and I can anticipate bugs as I write. I'm not perfect, but when bugs do arise, since I've built up the code, I have some idea of where to look and where not to look in order to fix them.

As for coworkers, I would really try to get them to work in chunks smaller than 20k loc. But at some point you have an expectation that coworkers will be accountable for their area of responsibility. If there's a bug in their code, they're expected to fix it. If there's a bug in the AIs code, I'm expected to fix it....

actinium226•6mo ago

And are you willing to stand behind those 20k loc? Like, whoever you're submitting it to, you can say "this is my work, it is done to a level of quality I find acceptable"?

ghuntley•6mo ago

I think you are perhaps missing the point. Investing into these techniques [2] enables you to do unhinged things. Such as building a compiler whilst you are AFK [1].

[1] https://x.com/i/broadcasts/1OyJALVOnEzGb

[2] https://ghuntley.com/ralph

moomoo11•6mo ago

Sure if I want to just toy around for fun.

Those are cool, but a production system is infinitely more complex.

ghuntley•6mo ago

What do you define as a production system? Are you aware that one can generate TLA+ specifications, then code generate from these specifications and assert that the implementation matches the TLA+ spec?

makeramen•6mo ago

You don't do it manually. You have claude do it once you’ve guided it back on track to remind itself not to do it next time.

libraryofbabel•6mo ago

Sigh. As others have commented, over and over again in the last 6 months we've seen discussions on HN with the same basic variation of "Claude Code [or whatever] is amazing" with a reply along the lines of "It doesn't work for me, it just creates a bunch of slop in my codebase."

I sympathize with both experiences and have had both. But I think we've reached the point where such posts (both positive and negative) are _completely useless_, unless they're accompanied with a careful summary of at least:

* what kind of codebase you were working on (language, tech stack, business domain, size, age, level of cleanliness, number of contributors)

* what exactly you were trying to do

* how much experience you have with the AI tool

* is your tool set up so it can get a feedback loop from changes, e.g. by running tests

* how much prompting did you give it; do you have CLAUDE.me files in your codebase

and so on.

As others pointed out, TFA also has the problem of not being specific about most of this.

We are still learning as an industry how to use these tools best. Yes, we know they work really well for some people and others have bad experiences. Let's try and move the discussion beyond that!

reactordev•6mo ago

Seconded, that a summary description of your problem, codebase, programming dialect in use, should be included whenever a “<Model> didn’t work for me” response.

dejavucoder•6mo ago

Fair point.

For context, I was using Claude Code on a Ruby + Typescript large open source codebase. 50M+ tokens. They had specs and e2e tests so yeah I did have feedback when I was done with a feature - I could run specs and Claude Code could form a loop. I would usually advise it to fix specs one by one. --fail-fast to find errors fast.

Prior to Claude Code, I have been using Cursor for an year or so.

Sonnet is particularly good at NextJS and Typescript stuff. I also ran this on a medium sized Python codebase and some ML related work too (ranging from langchain to Pytorch lol)

I don't do a lot of prompting, just enough to describe my problem clearly. I try my best to identify the relevant context or direct the model to find it fast.

I made new claude.md files.

zer00eyz•6mo ago

I spend a fair amount of time tinkering in Home Assistant. My experience with that platform and LLM's can be summed up as "this is amazing".

I also do a fair amount of data shuffling with Golang. My LLM experience there is "mixed".

Then I deal with quite a few "fringe" code bases and problem spaces. There LLM's fall flat past the stuff that is boiler plate.

"I work in construction and use a hammer" could mean framer, roofer or smashing out concrete with a sledge. I suspect that "I am a developer, I write code" plays out in much the same way, and those details dictate experience.

Just based on the volume of ruby and typescript, and the overlap of the output of these platforms your experience is going to be pretty good. I would be curious if you went and did something less mainstream, and in a less common language (say Zig) if you would have the same feelings and feedback that you do now. Based on my own experience I suspect you would not.

oblio•6mo ago

Speaking of that observation about "fringe": this will probably, increasingly, be a factor, let's call it LLMO (optimization), where "LLM friendly" content will be pushed. So I expect secondary or fringe programming languages to become even more pushed aside, since LLMs will not be as useful.

Which is, obviously, sad. Especially since the big winner is Javascript, a language that's still subpar as far as programming languages go.

state_less•6mo ago

Here's a few general observations.

Your LLM (CC) doesn't have your whole codebase in context, so it can run off and make changes without considering that some remote area of the codebase are (subtly?) depending on the part that claude just changed. This can be mitigated to some degree depending on the language and tests in place.

The LLM (CC) might identify a bug in the codebase, fix it, and then figure, "Well, my work here is done." and just leave it as is without considering ramifications or that the same sort of bug might be found elsewhere.

I could go on, but my point is to simply validate the issues people will be having, while also acknowledging those seeing the value of an LLM like CC. It does provides useful work (e.g. large tedious refactors, prototyping, tracking down a variety of bugs, and so on...).

simonw•6mo ago

Right, which is why having a comprehensive test suite is such an enormous unlock for this class of technology.

If your tests are good, Claude Code can run them and use them to check it hasn't broken any distant existing behavior.

dawnerd•6mo ago

Not always the case. It’ll just go and “fix” the tests to pass instead of fixing the core issue.

simonw•6mo ago

That used to happen a whole lot more. Recent Claudes (3.7, 4) are less likely to do that in my experience.

If they DO do that, it's on us to tell them to undo that and fix things properly.

dejavucoder•6mo ago

Can probably give access to tools like ast-grep to Claude. Will help it see all references. I still agree some dynamic references might still be left. Only way is to prompt well enough. Since I tested this on a Ruby on Rails codebase, I dealt with this.

theshrike79•6mo ago

This is why you keep CLAUDE.md updated, there it’ll write down what is where and other relevant info about the project.

Then it doesn’t need to feel (or rg) through the whole codebase.

You also use plan mode to figure out the issue, write the implementation plan in a .md file. Clear context, enter act mode and tell it to follow the plan.

imiric•6mo ago

It's telling that you ask these details from a comment describing a negative experience, yet the top-most comment full of praises and hyperbole is accepted at face value. Let's either demand these things from both sides or from neither. Just because your experience matches one side, doesn't mean that experiences different from yours should require a higher degree of scrutiny.

I actually think it's more productive to just accept how people describe their experience, without demanding some extensive list of evidence to back it up. We don't do this for any other opinion, so why does it matter in this case?

> Let's try and move the discussion beyond that!

Sharing experiences using anecdotal evidence covers most of the discussion on forums. Maybe don't try to police it, and either engage with it, or move on.

libraryofbabel•6mo ago

I should have been clearer - I'd like to see this kind of information from positive comments as well. It's just as important. If someone is having success with Claude Code while vide-coding a toy app, I don't care. If they're having success with it on a large legacy codebase, I want them to write a blog post all about what they're doing, because that's extremely useful information.

imiric•6mo ago

I jumped the gun a bit in my comment, since you did mention you want to see this from both sides. So it was clear, and I apologize.

The thing is that I often read this kind of response only to comments with negative experiences, while positive ones are accepted as fact. You can see this reinforced in the comments here as well. A comment section is not the right place to expand on these details, but I agree that blog posts should have them, regardless of the experience type.

gilfoy•6mo ago

It’s telling that they didn’t specifically address it at the negative experience and you filled that in yourself

rounce•6mo ago

It was the comment they replied to. If it was a general critique of the state of discourse around agentic tools and Claude Code in particular why not make it a top level comment?

libraryofbabel•6mo ago

Oh, because I wanted to illustrate that the discourse is exemplified by the pair of the GP comment (vague and positive) and the parent comment (vague and negative). Therefore I replied to the negative parent comment.

serf•6mo ago

>Let's either demand these things from both sides or from neither. Just because your experience matches one side, doesn't mean that experiences different from yours should require a higher degree of scrutiny.

Sort of.

The people that are happy with it and praising the avenues offered by LLM/AI solutions are creating codebases that fulfill their requirements, whatever those might be.

The people that seem to be unhappy with it tend to have the universal complaints of either "it produces garbage" , or "I'm slower with it.".

Maybe i'm showing my age here, but I remember these same exact discussions between people that either praised or disparaged search engines. The alternative being an internet Yellowpages (which was a thing for many years.)

The ones that praised it tended to be people who were taught or otherwise figured out how to use metadata tags like date:/onsite: , whereas the ones that disparaged it tended to be the folks who would search for things like "who won the game" and then proceed to click every scam/porno link on this green Earth and then blame Google/gdg/lycos/whatever when they were exposed to whatever they clicked.

in other words : proof is kind of in the pudding.

I wouldn't care about the compiler logs from a user that ignored all syntax and grammar rules of a language after picking it up last week, either -- but it's useful for successful devs to share their experiences both good and bad.

I care more about the opinions of those that know the rules of the game -- let the actual teams behind these software deal with the user testing and feedback from people that don't want to learn conventions.

oblio•6mo ago

> The ones that praised it tended to be people who were taught or otherwise figured out how to use metadata tags like date:/onsite: , whereas the ones that disparaged it tended to be the folks who would search for things like "who won the game" and then proceed to click every scam/porno link on this green Earth and then blame Google/gdg/lycos/whatever when they were exposed to whatever they clicked.

One big warning here: search engines only became really useful when you could search for "who won the game" and the search engine actually returned the correct thing as the top result.

We're more than a quarter of a century later and probably 99.99% of users don't know about Google's advanced search operators.

This should be a major warning for LLMs. People are people and will do people things.

imiric•6mo ago

> The people that are happy with it and praising the avenues offered by LLM/AI solutions are creating codebases that fulfill their requirements, whatever those might be.

Ah, but "whatever those might be" is the crucial bit.

I don't entirely disagree with what you're saying. There will always be a segment of power users who are able to leverage their knowledge about these tools to extract more value out of them than people who don't use them to their full potential. That is true for any tool, not just in software.

What you're ignoring are two other possibilities:

1. The expectation of users can be wildly different. Someone who has never programmed before, but can now create and ship a smartphone app, will see these tools as magical. Whatever issues they have will either go unnoticed, or won't matter considering the big picture. Surely their impression of AI tooling will be nothing short of positive. They might be experts at using LLMs, but not at programming.

OTOH, someone who has been programming for decades, and strives for a certain level of quality in their work, will find the experience much different. They will be able to see the flaws and limitations of these tools, and addressing them will take time and effort that they could've better spent elsewhere. As we've known since the introduction of LLMs, domain experts are the only ones who can experience these problems.

So the experience of both sides is valid, and should have equal weight in conversations. Unlike you, I do trust the opinion of domain experts over those of user experts, but that's a personal bias.

2. There are actual flaws and limitations in AI tooling. The assumption that all negative experiences are from users who are "holding it wrong", while all positive ones are from expert users, is wrong. It steers the conversation away from issues with the tech that should be discussed and addressed. And considering the industry is strongly propelled by hype and marketing right now, we need conversations grounded in reality to push back against it.

Aeolun•6mo ago

> The assumption that all negative experiences are from users who are "holding it wrong", while all positive ones are from expert users, is wrong.

I’m not sure about that. I feel like someone experienced would realize when using the LLM is a better idea than doing it themselves, and when they just need to do it by hand.

You might work in a situation where you have to do everything by hand, but then your response would be to the extent that you can see how it’s useful to other people.

leptons•6mo ago

>But I think we've reached the point where such posts (both positive and negative) are _completely useless_, unless they're accompanied with a careful summary of at least:

They did mention "(both positive and negative)", and I didn't take their comment to be one-sided towards the AI-negative comments only.

muzani•6mo ago

They're tools. To a fluent tool user, the negative anecdotes sound like,

"I prefer typewriters over word processors because it's easier to correct mistakes."

"I don't own any forks because knives are just better at cutting bread."

"Bidets make my pants wet, so I'll keep to toilet paper."

I think there's an urge to fix misinformation. Whereas if someone loves Excel and thinks Excel is better than Java at making apps, I have no urge to correct that. Maybe they know something about Excel that I don't.

0x457•6mo ago

There are some tasks that it can fail and not, but a lot of "Claude Code [or whatever] is amazing" with a reply along the lines of "It doesn't work for me, it just creates a bunch of slop in my codebase." IMO is "i know how to use it" vs "I don't know how to use it" with a side of "I have good test coverage" vs "tests?"

rstuart4133•6mo ago

> But I think we've reached the point where such posts (both positive and negative) are _completely useless_, unless they're accompanied with a careful summary of at least ...

I use Claude many times a day, I ask it and Gemini to generate code most days. Yet I fall into the "I've never included a line of code generated by an LLM in committed code" category. I haven't got a precise answer for why that is so. All I can come up with is the code generated lacks the depth of insight needed to write a succinct, fast, clear solution to the problem someone can easily understand in in 2 years time.

Perhaps the best illustration of this is someone proudly proclaimed to be they committed 25k lines in a week, with the help of AI. In my world, this sounds like they are claiming they have a way of turning the sea into ginger beer. Gaining the depth of knowledge required to change 25k lines of well written code would take me more than a week of reading. Writing that much in a week is a fantasy. So I asked them to show me the diff.

To my surprise, a quick scan of the diff revealed what the change did. It took me about 15 minutes to understand most of it. That's the good news.

The bad news it that 25k lines added 6 fields to a database. 2/3's were unit tests, perhaps 2/3's of the remainder was comments (maybe more). The comments were glorious in their length and precision, littered with ASCII art tables showing many rows in the table.

Comments in particular are a delicate art. They are rarely maintained, so they can bit rot in downright misleading babble after a few changes. But the insight they provide into what author was thinking, and in particular the invariants he had in mind can save hours of divining it from the code. Ideally they concisely explain only the obscure bits you can't easily see from the code itself. Anything more becomes technical debt.

Quoting Woodrow Wilson on the amount of time he spent preparing speeches:

    “That depends on the length of the speech,” answered the President. “If it is a ten-minute speech it takes me all of two weeks to prepare it; if it is a half-hour speech it takes me a week; if I can talk as long as I want to it requires no preparation at all. I am ready now.”

Which is a round about way of saying I suspect the usefulness of LLM generated code depends more on how often a human is likely to read it, than of any of the things you listed. If it is write once, and the requirement is it works for most people in the common cases, LLM generated code is probably the way to go.

I used PayPal's KYC web interface the other day. It looked beautiful, completely inline with the rest of PayPal's styling. But sadly I could not complete it because of bugs. The server refused to accept one page, it just returned to the same page with no error messages. No biggie, I phoned support (several times, because they also could not get past the same bug), and after 4 hours on the phone the job was done. I'm sure the bug will be fixed a new contractor. He spend an few hours on it, getting an LLM to write a new version, throwing the old code away, just as his predecessor did. He will say the LLM provided a huge productivity boost, and PayPal will be happy because he cost them so little. It will be the ideal application for an LLM - got the job done quickly, and no one will read the code again.

I later discovered there was a link on the page that allowed me to skip past the problematic page, so I could at least enter the rest of the information. It was in a thing that looked confusingly like a "menu bar" on the left, although there was no visual hit any of the items in the menu were clickable. I clicked on most of them anyway, but they did nothing. While on hold for phone support, I started reading the HTML and found one was a link. It was a bit embarrassing to admit to the help person I hadn't clicked that one. It sped the process up somewhat. As I said, the page did look very nice to the eye, probably partially because of the lack of clutter created by visual hints on what was clickable.

[0] https://quoteinvestigator.com/2012/04/28/shorter-letter/

QuantumGood•6mo ago

Agree. It keeps getting closer to "I've had a negative experience with the internet ..."

positron26•6mo ago

The framing has been rather problematic. I find these differences in premises are lurking below the conversations:

- Some believe LLMs will be a winner-take-all market and reinforce divergences in economic and political power.

- Some believe LLMs have no path of evolution and have therefore already plateaued and too low to be sustainable with these investments in compute, which would imply it's a flash in the pan that will collapse.

- Some believe LLMs will all be hosted forever, always living in remote services because the hardware requirements will always be massive.

- Some believe LLMs will create new, worse kinds of harm without enough offsetting creation of new kinds of defense.

- Some believe LLMs and AI will only ever give low-skilled people mid-skill results and therefore work against high-skill people by diluting mid-end value without creating new high-end value for them.

We need to be more aware of how we are framing this conversation because not everyone agrees on these big premises. It very strongly affects the views that depend on them. When we don't talk about these points and just judge and reply based on whether the conclusion reinforces our premises, the conversation becomes more political.

Confirmation bias is a thing. Individual interests are a thing. Some of the outcomes, like regulation and job disruption, depend on what we generally believe. People know this and so begin replying and voting according to their interests, to convince others to aid their cause without respect for the truth. This can be counter-productive to the individual if they are wrong about the premises and end up pushing an agenda that doesn't even actually benefit them.

We can't tell people not to advance their chosen horse at every turn of a conversation, but those of us who actually care about the truth of the conversation can take some time to consider the foundations of the argument and remind ourselves to explore that and bring it to the surface.

flir•6mo ago

I find it telling that I have (mostly) good experiences with the GPT family and (mostly) bad experiences with the Claude family.

I just wish I could figure out what it tells. Their training data can't be that different. The problems I'm feeding them are the same. Many people think Claude is the more capable of the two.

It has to be how I'm presenting the problems, right? What other variable is there?

matwood•6mo ago

If you have been using GPT for awhile it simply may know more about you.

actinium226•6mo ago

I'm not convinced that "we know they work really well for some people." So far I just see people really excited about the potential and really impressed at what it's capable of, but I think people are extrapolating poorly. It's like, yes it's impressive that it can make a video game with a few prompts, but that doesn't mean that with a few more prompts it'll turn into a AAA game.

I'm on board with some limited AI autocompletion, but so far agents just seem like gimmicks to me.

fragmede•6mo ago

If we handwave that the popular game Wordle, which made a lot of money for its author, could have been vibecoded, at what point does the gimmick become an actual feature that people look and pay for?

actinium226•6mo ago

No shade at wordle, but what you're describing sounds like it would be useful for the shovelware industry and that's about it. Not exactly a great leap forward for humanity...

Although I should be fair, this can help with one-off scripts that research folks usually do, when you just need to plot some data or do some back-of-the-terminal math. That said I don't think this would be a game changer, more of an efficiency boost and a limited one at that.

fragmede•6mo ago

What would a great leap forwards for humanity look like? Sure, making it easier to shovel out shovelware means more shovelware, but why is that a bad thing? If customers have a very specific problem that wasn't going to get solved because it was too expensive to build a custom solution, and they now get to have bespoke software to cure their ills, other than being judgemental about this hypothetical piece of software as being shovelware, why is that a bad thing?

actinium226•6mo ago

Here's one version of what a great leap forward could look like, but it's simply one of many: an LLM that understand the CPU it's running on and can turn prompts into assembly, taking full advantage of the hardware. Or maybe it could target a virtual CPU like Java, but the point is that if the LLM can write code, why do it in Python or C? Just let it understand the CPU and let it rip. The only reason we have C/Python/etc. in the first place is because assembly sucks for humans to work with.

As to the shovelware, if it benefits people that's great, and I think the net benefit will likely be positive, but only slightly. The point in calling it shovelware is to suggest that it's low quality, and so it could have bugs and other performance issues that add costs to using which subtract from the benefit it provides (possibly in a net positive way, but probably not as fundamentally game changing as, say, Docker).

wyldfire•6mo ago

I have seen both success and failure. It's definitely cool and I like to think of it as another perspective for when I get stuck or confused.

When it creates a bunch of useless junk I feel free to discard it and either try again with clearer guidelines (or switch to Opus).

leptons•6mo ago

Have you tried vibing harder?

baka367•6mo ago

For me it fixed a library compatibility issue with React 19 in 10 mins and several nudges startign from the console error and library name.

It would have been a half-day worth of adventure at least should i have done it myself (from diagnosing to fixing)

Philpax•6mo ago

> I go in and debug for a while because the app has become fubar, then finally realize it did the whole thing incorrectly and throw it all away.

This seems consistent with some of the more precocious junior engineers I've worked with (and have been, in the past.)

nzach•6mo ago

> take several minutes to do something

The quality of the generated code is inversely proportional to the time it take to generate it. If you let Claude Code work alone for more than 300 seconds you will receive garbage code. Take that as a hint, if it can't finish the task in this time it means you are asking too much. Break up your feature and try with a smaller feature.

pragmatic•6mo ago

Could you elaborate a bit on the tasks,languages,domain etc you’re using it with?

People have such widely varying experiences and I’m wondering why.

criddell•6mo ago

I haven't had great luck with Claude writing Windows Win32 (using MFC) in C++. It invents messages and APIs all the time that read like exactly what I want it to do.

I'd think Win32 development would be something AIs are very strong at because it's so old, so well documented, and there's a ton of code out there for it to read. Yet it still struggles with the differences between Windows messages, control notification messages, and command messages.

thegrim33•6mo ago

I find it pretty interesting that it's a roughly 2,500 word article on "using Claude Code" and they never once actually explain what they're using it for, what type of project they're coding. It's all just so generic. I read some of it then realize that there was absolutely no substance in what I just read.

It's also another in my growing list of data points towards my opinion that if an author posts meme pictures in their article, it's probably not an article I'm interested in reading.

kraftman•6mo ago

Yeah I got about half way through before thinking "wow theres no information in this" and giving up.

sensanaty•6mo ago

Always good to be cognizant that there are MANY people out there, especially on HN/YC circles, with large vested interests in LLM tooling. Just check out the YC batches lately, you'll be hard pressed to find a single one that doesn't mention AI or LLMs in some way.

vitaflo•6mo ago

It’s always telling when people don’t show their work. I’m not saying LLMs can’t do a good job but if you’re not even explaining the steps you used or showing the code that was generated or fixed then I have to assume what was really produced was unmaintainable spaghetti code that just happened to compile.

danielbln•6mo ago

I literally cannot share my code, as that belongs to the company or the client. So what would I show? My prompts? Pseudo code? Unless you're an open source developer or build personal projects, requests for "show your code, bro" are hard to satisfy.

leptons•6mo ago

That's fine, but then don't write an article about it if you can't show the code. The vagueness just makes the article look unsupported by facts.

_se•6mo ago

It's always POC apps in js or python, or very small libraries in other popular languages with good structure from the start. There are ways to make them somewhat better in other cases (automated testing/validation/linting being a big one), but for the type of thing that 95% of developers are doing day to day (working on a big, sprawling code base where none of those attributes apply), it's not close to being there.

The tools really do shine where they're good though. They're amazing. But the moment you try to do the more "serious" work with them, it falls apart rapidly.

I say this as someone that uses the tools every day. The only explanation that makes sense to me is that the "you don't get it, they're amazing at everything" people just aren't working on anything even remotely complicated. Or it's confirmation bias that they're only remembering the good results - as we saw with last week's study on the impact of these tools on open source development (perceived productivity was up, real productivity was down). Until we start seeing examples to the contrary, IMO it's not worth thinking that much about. Use them at what they're good at, don't use them for other tasks.

LLMs don't have to be "all or nothing". They absolutely are not good at everything, but that doesn't mean they aren't good at anything.

Herring•6mo ago

I like them for refactoring and “explain this massive codebase please”. Basically polishing or investigating things that already work.

But I think we should expect the scope of LLM work to improve rapidly in the next few years.

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

oblio•6mo ago

The bad news is that mostly, as far as we can see, that doubling of performance also requires (at least) doubling of resource usage, plus we're getting close to a point where planetary resources for doubling LLM resources are getting kind of low...

Herring•6mo ago

This species is going extinct. I finally accepted that when my dad died rather than change his lifestyle, despite being warned 10000x. My mom survived a heart attack, saw what happened to my dad, still hasn't changed her lifestyle.

ants_everywhere•6mo ago

> They're amazing. But the moment you try to do the more "serious" work with them, it falls apart rapidly.

Sorry, but this is just not true.

I'm using agents with a totally idiosyncratic code base of Haskell + Bazel + Flutter. It's a stack that is so quirky and niche that even Google hasn't been able to make it work well despite all their developer talent and years of SWEs pushing for things like Haskell support internally.

With agents I'm easily 100x more productive than I would be otherwise.

I'm just starting on a C++ project, but I've already done at least 2 weeks worth of work in under a day.

cpursley•6mo ago

What do you mean “with agents”?

ants_everywhere•6mo ago

I've been using mainly gemini-cli and am starting to play around with claude code.

cpursley•6mo ago

Are you referring to those as agents or do you mean spinning separate/multiple agents out of sessions on them?

_se•6mo ago

Share the codebase and what you're doing or, I'm sorry, you're just another example of what I laid out above.

If you honestly believe that "agents" are making you better than Goole SWEs then you severely need to take a step back and reevaluate, because you are wrong.

iammrpayments•6mo ago

I’m going to ask what I’ve asked the last person here who said they are “10-20x” more productive:

If you’re really that more productive, why don’t you quit your job and vibecode 10 ios apps (in your case that would be 50 to 100 proportionally)

Aeolun•6mo ago

Because money? Even if you can quickly build them it’s pointless if you can’t sell them. And Claude cannot help with that.

artisin•6mo ago

Hold the phone. So, Google, with its legions of summa cum laude engineers, can't make this stack work well, but your AI agent is nailing it into next week? Seriously, show me the way, so I too may find AI enlightenment.

Aeolun•6mo ago

Hmm, I got Claude Opus to build me a game in Rust. I don’t think it really counts as POC app any more at that point.

_se•6mo ago

It absolutely counts as a POC app until it's production grade, deployed, being used by people, maintained over time, etc.

This doesn't mean that it's not useful, or that you shouldn't be happy with what the LLM built. I also had Claude Code build me a web app for my own personal use in Rust this week. It's very useful to me. But it is 100% of POC/MVP quality, and always will be, because the code that it created is abjectly awful and I would never be able to scale it into a real world service without rewriting 50+% of it.

rr808•6mo ago

I opinion is that the AI is the distilled average of all the code it can scrape. For the stuff I'm good at and work on every day it doesn't help much beyond some handy code completions. For stuff I'm below average at like bash commands and JS it helps me get up to average. The most valuable to me is if I can use it to learn something - it gives some good alternatives and ideas if you have something mainstream.

iambateman•6mo ago

I’m a TALL developer, so Laravel, Livewire, Tailwind, Alpine.

It’s nice because 3/4 of those are well-known but not “default” industry choices and it still handles them very well.

So there’s a Laravel CRM builder called Filament which is really fun to work in. Claude does a great job with that. It’s a tremendous amount of boilerplate with clear documentation, so it makes sense that Claude would do well.

The thing I appreciate though is that CC as an agent is able to do a lot in one go.

I’ve also hooked CC up to a read-only API for a client, and I need to consume all of the data on that API for the transition to a Filament app. Claude is currently determining the schema, replicating it in Laravel, and doing a full pull of API records into Laravel models, all on its own. It’s been running for 10 minutes with no interruption and I expect will perform flawlessly at that.

I invest a lot of energy in prompt preparation. My prompts are usually about 200 words for a feature, and I’ll go back and forth with an LLM to make sure it thinks it’s clear enough.

William_BB•6mo ago

The reason is probably complexity and the task at hand.

In my experience, LLMs are great at small tasks (bash or python scripts); good at simple CRUD stuff (js, ts, html, css, python); good at prototyping; good at documentation; okay at writing unit tests; okay at adding simple features in more complex databases;

Anything more complex and I find it pretty much unusable, even with Claude 4. More complex C++ codebases; more niche libraries; ML, CV, more mathsy domains that require reasoning.

ModernMech•6mo ago

When I read these LLM in coding discussions, I'm reminded a lot of online dating discussions. Someone will post "I'm really having a tough time dating. I tried X (e.g. a dating app) but I had a tough experience." Someone will respond "I tried X and had great success. Highly recommended." Seems confounding, but when you click into their profiles to see pictures of each person, it becomes abundantly clear why these people report different experiences.

Not to dog the author too hard, but a look at their Github profile says a lot about the projects they've worked on and what kind of dev they are. Not much there in terms of projects or code output, but they do have 15k followers on Twitter, where they post frequently about LLMs to their audience.

They aren't talking about the tasks and the domains they're using because that's incidental; what they really want to do is just talk about LLMs to their online audience, not ship code.

kbuchanan•6mo ago

I've had the same experience, although I feel like Claude is far more than a junior to me. It's ability to propose options, make recommendations, and illustrate trade-offs is just unreal.

gjsman-1000•6mo ago

> It’s the first time it really does feel like working with a junior engineer to me.

I have mixed feelings; because this means there’s really no business reason to ever hire a junior; but it also (I think) threatens the stability of senior level jobs long term, especially as seniors slowly lose their knowledge and let Claude take care of things. The result is basically: When did you get into this field, by year?

I’m actually almost afraid I need to start crunching Leetcode, learning other languages, and then apply to DoD-like jobs where Claude Code (or other code security concerns) mean they need actual honest programmers without assistance.

However, the future is never certain, and nothing is ever inevitable.

Quarrelsome•6mo ago

> because this means there’s really no business reason to ever hire a junior

aren't these people your seniors in the coming years? Its healthy to model an inflow and outflow.

toomuchtodo•6mo ago

The pipeline dries up when orgs would rather get the upfront savings of gen AI productivity gains versus invest in talent development.

kimixa•6mo ago

It's a junior engineer that doesn't learn - they make the same mistakes even after being corrected the second that falls out their context window (even often with "corrections" still there...), they struggle to abstract those categories of mistakes to avoid making similar ones in the future, and (by the looks of it) will never be "the senior". "Hiring a Junior" should really be seen as an investment more than immediate output.

I keep being told that $(WHATEVER MODEL) is the greatest thing ever, but every time I actually try to use them they're of limited (but admittedly non-zero) usefulness. There's only so many breathless blogs or comments I can read that just don't mesh with the reality I personally see.

Maybe it's sector? I generally work on Systems/OS/Drivers, large code bases in languages like C, C++ and Rust. Most larger than context windows even before you look at things like API documentation. Even as a "search and summarizer" tool I've found it completely wrong in enough cases to be functionally worthless as the time required to correct and check the output isn't a saving. But they can be handy for "autocompletion+" - like "here's a similar existing block of code, now do the same but with (changes)".

They generally seem pretty good at being like a template engine on non-templated code, so thing like renaming/refactoring or similar structure recognition can be handy. Which I suspect might also explain some of those breathless blog posts - I've seen loads which say "Any non-coder can make a simple app in seconds!" - but you could already do that, there's a million "Simple App Tutorial" codebases that would match whatever license you want, copy one, change the name at the top and you're 99% of the way to the "Wow Magic End Result!" often described.

moomoo11•6mo ago

We are using probabilistic generators to output what should be deterministic solutions.

danielbln•6mo ago

You know what else is probabilistic? You and me. That's why we have tooling in place to mitigate that and to constrain our variable outputs into more reliable, deterministic results. And luckily, a lot of that tooling can be used for probabilistic machines as well.

Philpax•6mo ago

> DoD-like jobs where Claude Code (or other code security concerns) mean they need actual honest programmers without assistance.

Then they'll just get a contract to spin up a DoD-secure variant: https://www.anthropic.com/news/anthropic-and-the-department-...

ModernMech•6mo ago

DoD will probably be requiring use of Mechahitler soon enough.

apwell23•6mo ago

half the posts on hackernews is same discussion over and over about coding agent usefulness or lack of

ivanech•6mo ago

Just got it at work today and it’s a dramatic step change beyond Cursor despite using the same foundation models. Very surprising! There was a task a month ago where AI assistance was a big net negative. Did the same thing today w/ Claude Code in 20ish minutes. And for <$10 in API usage!

Much less context babysitting too. Claude code is really good at finding the things it needs and adding them to its context. I find Cursor’s agent mode ceases to be useful at a task time horizon of 3-5 minutes but Claude Code can chug away for 10+ minutes and make meaningful progress without getting stuck in loops.

Again, all very surprising given that I use sonnet 4 w/ cursor + sometimes Gemini 2.5 pro. Claude Code is just so good with tools and not getting stuck.

iambateman•6mo ago

Cool! If you're on pro, you can use a _lot_ of claude code without paying for API usage, btw.

bn-l•6mo ago

Even though it’s the same model cursor adds a massive system prompt to every request. And it’s shit and lobotomises the models. After the rug pull I’m exclusive Claude code at the end of my billing period or when cursor cut me off the $60 a month plan—-which will probably come first—-a bit over halfway into my month.

dude250711•6mo ago

How are you guys happy with an 80-s looking terminal interface is beyond me...

If Claude is so amazing, could Anthropic not make their own fully-featured yet super-performant IDE in like a week?

yoyohello13•6mo ago

Free yourself from the shackles of the GUI.

dude250711•6mo ago

Free yourself from the shackles or displays too? Back to punched cards?

kobe_bryant•6mo ago

in what sense, instead of doing your job which I assume you've been doing successfully for many years you now ask Claude to do it for you and then have to review it?

giancarlostoro•6mo ago

I am loving the Zed editor and they integrate Claude primarily so I might give it a shot.

komali2•6mo ago

Does anyone have any usage guides they can recommend to feel this way about using Claude code, other than the OP article? I fired it up yesterday for about an hour and tried it on a couple tickets and it felt like a total waste of time. The answers it gave were absurdly incorrect - I was being quite specific in my prompting and it seemed to be acquiring the proper context, but just doing nothing like what I was asking.

E.g. I asked it to swap all on change handlers in a component to modify a use State rather than directly fire a network request, and then add on blurs for the actual network request. It didn't add use states and just added on blurs that sent network requests to the wrong endpoint. Bizarre.

satisfice•6mo ago

Are you doing anything useful? How can anyone outside of yourself know this?

My own experiments only show that this technology is unreliable.

ern•6mo ago

I liked Claude Code when I used it initially to document a legacy codebase. The developer who maintains the system reviewed the documentation, and said it was spot-on.

But the other day I asked it to help add boundary logging to another legacy codebase and it produced some horrible, duplicated and redundant code. I see these huge Claude instruction files people share on social media, and I have to wonder...

Not sure if they're rationing "the smarts" or performance is highly variable.

polishdude20•6mo ago

I found cursor much better than Claude Code. Running Claude code it did so many commands and internal prompting to get a small thing done and ate up tonnes of my quota. Cursor on the other hand did it super quick and straight to the point. Claude code just got stuck in grep hell

datpuz•6mo ago

Just wait til the honeymoon period ends and you actually have to stand behind that slop you didn't realize you were dumping into your codebase.

chisleu•6mo ago

Like working with an incredibly talented and knowledgable junior engineer, but still a junior engineer.

If you want to try something better than claude code, try Cline.

poszlem•6mo ago

Can you explain how cline is better?

chisleu•6mo ago

I love the interface. It makes it extremely easy to rewind time to undo code edits and rewinding the LLM context at the same time. It's prompting and toolset is great. It's got MCP which I have integrated into my workflow. It's got a solid marketplace of auto installing MCP services. I love it.

anothernewdude•6mo ago

I agree with the comparison to steroids, but then I've seen people go through the health issues caused by steroids so we might mean different things by that comparison.

poszlem•6mo ago

I agree. I also recommend people read this: https://docs.anthropic.com/en/docs/build-with-claude/prompt-...

There are some things in there that really take this from an average tool to something great. For example, a lot of people have no idea that it recognizes different levels of reasoning and allocates a bigger number of “thinking tokens” depending on what you ask (including using “ultrathink” to max out the thinking budget).

I honestly think that people who STILL get mostly garbage outputs just aren’t using it correctly.

Not to mention the fact that people often don't use Opus 4 and stay with Sonnet to save money.

Aeolun•6mo ago

> It’s the first time it really does feel like working with a junior engineer to me.

I feel like working with Claude is what it must feel like for my boss to work with me. “Look, I did this awesome thing!”

“But it’s not what I asked for…”

strtok•6mo ago

Unlike a junior engineer, its feelings don’t get hurt when you ask for a redo

actinium226•6mo ago

> it really does feel like working with a junior engineer to me.

I agree. It reminds me of this one junior engineer I worked with who produced awful code, and it would take longer to explain stuff to him than to just do it myself, let alone all the extra time I had to spend reviewing his awful PRs. I had hoped he would improve over time, but he took my PR comments personally and refused to keep working with me. At least Claude doesn't have an attitude.

wrs•6mo ago

There is a VS Code extension for Claude Code. It's hardly more than a terminal window really, but that in itself is pretty handy. If you do /ide to connect up the extension it does a few things, but not yet anything resembling the Cursor diff experience (much less the Cursor tab experience, which is the reason I still use it).

dejavucoder•6mo ago

I use Claude Code 50% of times with Cursor now due to the diff and tab. The extension is just a bit buggy sometimes otherwise I would use it much more. I hit some node related bugs today while searching stuff with it (forgot to report to Anthropic lol). Other bugs include a scroll stuttering.

mike1o1•6mo ago

Claude Code has pretty much replaced Copilot overnight for me, though I wish the VS Code plugin was a bit more integrated, as it's only a little bit more than a terminal, though I guess that's the point. I was hoping for syntax highlighting to match my editor and things like that (beyond just light/dark theme).

What I'd really want is a way to easily hide it, which I did quite frequently with Copilot as its own pane.

deeshee•6mo ago

It's great to see even the most hardcore developers who are not fond of change being happy with the latest releases related to AI-assisted development.

My workflow now boils down to 2 tools really - leap.new to go from 0 to 1 because it also generates the backend code w/ infra + deployment and then I pick it up in Zed/Claude Code and continue working on it.

ardit33•6mo ago

1.So far, it is great if you know what you want, and tell it exactly how you want it, and AI can help you on that (basically intern level work).

2. When you are in a new area, but you don't want to dive deep and just want something quick and it is not core of the app/service.

But, if you are experienced, you can see how AI can mess things up pretty quickly, hence for me it has been best used to 'fill in clear and well defined functionality' at peacemeal. Basically it is best for small bites, then large chunks.

deeshee•6mo ago

I agree. But it's also a mindset game. Experienced devs often approach AI with preconceptions that limit its utility - pride in "craftsmanship, control issues, and perfectionism can prevent seeing where AI truly shines. I've found letting go of those instincts and treating AI as a thought partner rather than just a code generator be super useful. The psychological aspects of how we interact with these tools might be as important as the technical ones.

Bunch of comments online also reflect how there's a lot of "butthurt" developers shutting things down with a closed mind - focusing only on the negatives, and not letting the positives go through.

I sound a bit philosophical but I hope I'm getting my point across.

financltravsty•6mo ago

What's your track record. What is your current scope of work for Claude Code?

This conversation is useless without knowing the author's skillset and use-case.

Quarrelsome•6mo ago

> pride in "craftsmanship, control issues, and perfectionism

I mean, do we really want our code base to not follow a coding standard? Or are network code not to consider failure or transactional issues? I feel like all of these traits are hallmarks of good senior engineers. Really good ones learn to let go a little but no senior is going to watch a dev automated or otherwise, circumvent six layers of architecture by blasting in a static accessor or smth.

Craftsmanship, control issues and perfectionism, tend to exist for readability, to limit entropy and scope, so one can be more certain of the consequences of a chunk of code. So to consider them a problem is a weird take to me.

RamblingCTO•6mo ago

> pride in "craftsmanship, control issues, and perfectionism

sounds like you can't code for shit. guidelines, standards, and formatting have developed for a reason. the reason is: less bugs and maintainability. you sound like the average cocky junior to me.

2sk21•6mo ago

I'm curious: Do you scrutinize every line of code that's generated?

cmrdporcupine•6mo ago

At first I dd not. Now I have learned I have to.

You have to watch Claude Code like a hawk. Because it's inconsistent. It will cheat, give up, change directions, and not make it clear to you that is what it's doing.

So, while it's not "junior" in capabilities, it is definitely "junior" in terms of your need as a "senior" to thoroughly review everything it does.

Or you'll regret it later.

bluetidepro•6mo ago

How are people using this without getting rate limited non stop? I pay for Claude Pro and I sometimes can’t go more than 5 prompts in an hour without it saying I need to wait 4 hours for a cooldown. I feel like I’m using it wrong or something, it’s such a frustrating experience. How do you give it any real code context without using all your tokens so quickly?

terhechte•6mo ago

you need the max plan to break free of most rate limits

bluetidepro•6mo ago

I wish there was a Max trial (while on Pro) to test if this was the case or not. Even if it was just a 24 hour trial. Max is an expensive trigger to pull, and hope it just solves this.

cmrdporcupine•6mo ago

FWIW I went Claude Max after Pro, and the trick is to turn off Opus. If you do that you can pretty much use Sonnet all working day in a normal session. I don't personally find Opus that useful, and it burns through quota at 5x the speed of Sonnet.

wahnfrieden•6mo ago

It is typical to buy 2-3 Max tier plans for sustained Opus use

tomashubelbauer•6mo ago

I have the same issue and in recent days I seem to have gotten an extra helping of overload errors which hit extra hard when I realize how much this thing costs.

Edit: I see a sibling comment mention the Max plan. I wanna be clear that I am not talking about rate limits here but actual models being inaccessible - so not a rate limit issue. I hope Anthropic figures this out fast, because it is souring me on Claude Code a bit.

mbrumlow•6mo ago

No clue. I use it for hours on end. Longest run cost me $30 in tokens. I think it was 4 hours of back and forth.

Here is an example of chat gpt, followed by mostly Claude that finally solved a backlight issue with my laptop.

https://github.com/mbrumlow/lumd

singron•6mo ago

I haven't used Claude Code a lot, but I was using about $2-$5/hour, but it varied a lot. If I used it 6 hours/day and worked a normal 21 workday month (126 hours), then I would rack up $250-$630/month in API costs. I think I could be a more efficient with practice (maybe $1-$3/hour?). If you think you are seriously going to use it, then the $100/month or $200/month subscriptions could definitely be worth it as long as you aren't getting rate limited.

If you aren't sure whether to pull the trigger on a subscription, I would put $5-$10 into an API console account and use CC with an API key.

manmal•6mo ago

Try giving it a repomap, eg by including it in CLAUDE.md. It should pull in less files (context) that way. Exactly telling it which files you suspect need editing also helps. If you let it run scripts, make sure to tell it to grep out only the relevant output, or pipe to /dev/null.

ndr_•6mo ago

I had success through Amazon Bedrock on us-east1 during European office hours. Died 9 minutes before 10 a.m. New York time, though.

SwiftyBug•6mo ago

I've been using it pretty heavily and never have I been rate limited. I'm not even on the Pro Max plan.

cmrdporcupine•6mo ago

Claude Max, honestly. Worth it to me.

stavros•6mo ago

Are you using Opus?

ToJans•6mo ago

Whenever I'm rate limited (pro max plan), I stop developing.

For anything but the smallest things I use claude code...

And even then...

For the bigger things, I ask it to propose to me a solution (when adding new features).

It helps when you give proper guidance: do this, use that, avoid X, be concise, ask to refactor when needed.

All in all, it's like a slightly autistic junior dev, so you need to be really explicit, but once it knows what to do, it's incredible.

That being said, whenever you're stuck on an issue, or it keeps going in circles, I tend to rollback, ask for a proper analysis based on the requirements, and fill in the details of necessary.

For the non-standard things (f.e. detect windows on a photo and determine the measurement in centimetres), you still have to provide a lot of guidance. However, once I told it to use xyz and ABC it just goes. I've never written more then a few lines of PHP in my life, but have a full API server with an A100 running, thanks to Claude.

The accumulated hours saved are huge for me, especially front-end development, refactoring, or implementing new features to see if they make sense.

For me it's a big shift in my approach to work, and I'd be really sad if I have to go back to the pre-AI area.

Truth to be told, I was a happy user of cline & Gemini and spent hundreds of dollars on API calls per month. But it never gave me the feeling Claude code gave me, the reliability for this thing is saving me 80% of my time.

dontlaugh•6mo ago

I still don’t get why I should want that.

I’ve mentored and managed juniors. They’re usually a net negative in productivity until they are no longer juniors.

quesera•6mo ago

My current working theory is this:

People who enjoy mentoring juniors are generally satisfied with the ROI of iterating through LLM code generation.

People who find juniors sort-of-frustrating-but-part-of-the-job-sometimes have a higher denominator on that ROI calc, and ask themselves why they would keep banging their head against the LLM wall.

The first group is probably wiser and more efficient at multiplying their energies, in the long term.

I find myself in the second group. I run tests every couple months, but I'm still waiting for the models to have a higher R or a lower I. Any day now.

moomoo11•6mo ago

I'm cynical person, and IME the former are some of the most annoying and usually the worst engineers I've met.

Most people who "mentor" other people (like, make it a pride and distinction part of their identity) are usually the last people you want to take advice from.

Actual mentors are the latter group, who juniors seek out or look up to.

In other words, the former group is akin to those people on YouTube who try to sell shitty courses.

quesera•6mo ago

That's the extreme end of the first-group spectrum, but I definitely agree that they exist!

aniforprez•6mo ago

It's the complete opposite for me. I enjoy the process of mentoring juniors and am usually sought out for a lot of little issues like fixing git workflows or questions on how a process works. Working with an LLM is absolutely not what I want to do because I'd much rather mentees actually learn and ask me less and less questions. My experience with AI so far is that it never learns at all and it has never felt to me like a human. It pretends to be contrite and apologises for mistakes but it makes those mistakes anyway. It's the worst kind of junior who repeats the same mistake multiple times and doesn't bother committing those to memory.

quesera•6mo ago

You're right, I'm probably lumping the first group over-broadly, since I understand them less well.

It would make sense for there to be subgroups within the first group. It sounds like you prioritize results (mentee growth, possibly toward long-term contribution), and it's also likely that some people just enjoy the process of mentoring.

ToJans•6mo ago

It depends... I've worked with hundreds of juniors & seniors during my consulting days.

I've had ups and downs in this situation, but on most cases it's about showing the light to a path forward.

In most cases, the software development was straightforward, and most of the coaching was about a how to behave in the organisation they were functioning in.

One can only have so many architecture/code quality reviews, typically we evacuated the seniority of the devs on their ability to cope with people (colleagues, bosses, clients, ...)

We did have a few very bright technical people as well, but those were about 10 on a 2000-person company.

The reason I explicitly mentioned the slightly autistic junior person, is because I've worked with one, who was about to be fired, because other people had issues dealing with him.

So I moved desks, sat next to him for over a month, and he ended up becoming the champion for one of the projects we were doing, because he was very bright, precise and had a huge memory, which mattered a lot in that context.

Other stories are similar, once they were about to throw out a colleague because he was taking days to do something that should have taken a few hours max. So I say next to him, to see what he was doing.

Turned out he was refactoring all the code his feature touched because he couldn't stand bad code. So we moved him to quality control, and last time I checked he was thriving...

I guess what I'm saying is that -just like with people -, you need to find a good modus operandi, and have matching expectations, but if you can figure it out, it will pay off dividends.

erentz•6mo ago

There must at this point be lots and lots of actual walkthroughs of people coding using Claude Code, or whatever, and producing real world apps or libraries with them right? Would be neat to have a list because this is what I want to read (or watch), rather than people just continuously telling me all this is amazing but not showing me it’s amazing.

TheRoque•6mo ago

100% agree, I have been looking for a YouTube video or stream of someone leveraging AI to get their productivity boost, but I haven't found anything that made me think "okay, that really speeds up things"

aniforprez•6mo ago

It was extremely telling to me that the Zed editor team put out a video about using their AI interface and I don't remember what model they used but they asked it to add a new toggle for a feature and then spent half they video demonstrating their admittedly excellent review workflow to accept or reject the AI generated code and you could directly see how useless it was down to adding completely superfluous new lines randomly and the explanation was "it just does that sometimes"

I'm really not seeing these massive gains in my workflow either and maybe it's the nature of my general work but it's baffling how every use case for programming I'm seeing on YouTube is so surface level. At this point I've given up and don't use it at all

leptons•6mo ago

I asked the "AI" to do something relatively simple - fix when a button was enabled or disabled. And it then rewrote the entire 200+ line file, which then did not run, it was completely broken. Nice job "AI"!

FiberBundle•6mo ago

https://m.youtube.com/watch?v=XyQ4ZTS5dGw&pp=ygUZTWl0Y2hlbGw...

Not a "100x" boost, but a pretty good take on what tasks agents can do for even very good programmers.

umko21•6mo ago

I would like to see this list too. However, I guess people don't disclose they use AI when making stuff. AFAIK, this https://github.com/cloudflare/workers-oauth-provider repository is made with Claude code.

ookblah•6mo ago

this sounds like a cop out, but honestly you will probably see the most vocal on both sides of this here while vast majority are just quietly doing working and doing their stuff (ironic i'm writing this).

i feel like some kind of shill, but honestly i'm anywhere from 1.5x to 10x on certain tasks. the main benefit is that i can reduce a lot of cognitive load on tasks where they are either 1) exploratory 2) throwaway 3) boilerplate-ish/refactor type stuff. because of that i have a more consistent baseline.

i still code "by hand". i still have to babysit and review almost all the lines, i don't just let it run for hours and try to review it at the end (nightmare). production app that's been running for years. i don't post youtube videos bc i don't have the time to set it up and try to disprove the "naysayers" (nor does that even matter) and its code i can't share.

the caveat here is we are a super lean team so probably i have more context into the entire system and can identify problems early on and head them off. also i have a vested interest in increasing efficiency for myself wheras if you're part of a corpo ur probably doing more work for the same comp.

magicalist•6mo ago

> this sounds like a cop out

This may sound more mean than I intend, but your comment is exactly the kind of thing the GP post was describing as useless yet ubiquitous.

ookblah•6mo ago

i mean i get it, but at some point you have to either assume everyone is lying or we are all bots or something. it's probably the same feeling i get when i read someone having trouble and my first thought is like "they aren't using it right". i'm sure the reverse is something like "they aren't a real programmer" LOL.

jimbokun•6mo ago

It's much simpler than that.

Without concrete details about the exact steps you're taking, these conversations are heat without light.

ookblah•6mo ago

and without concrete details about the exact steps you're taking to fail these conversations are heat without light. see how that works?

nobody can be bothered to show how these coding llms fall flat on their face apparently with their own real detailed examples and people can't be bothered to setup some detailed youtube video with all source code because in the end i'm not trying that hard to convince people to use tools they don't want to use.

i think with all the comments maybe the more people who stumble through this the better. the cloudflare example is a decent starting point and i've already given you the general approach. i'm fine with that being a copout lol.

jimbokun•6mo ago

> and without concrete details about the exact steps you're taking to fail these conversations are heat without light.

Of course.

filoeleven•6mo ago

> i'm anywhere from 1.5x to 10x on certain tasks

By your own reckoning. There was that recent study showing how senior devs using AI thought they were 20% faster, when they were objectively 20% slower.

leptons•6mo ago

They may spend 20% less time writing code but 20% more time explaining to the AI what they want and fixing the output. People are not good at basic math or time accounting, even programmers.

ookblah•6mo ago

i mean at this point i don't really know what to say to comments like these basically calling me a liar lol. i log increases in sustained lines outputted and removed, files touched, features pushed per day. you can argue the level of increase, but it's definitely not 20% slower shrugs.

and this is exactly what i'm talking about. in the end who gives a shit about some study that may or may not apply to me as long its actually working.

people literally frothing at the mouth to tell people who find it useful that they are utterly wrong or delusional. if you don't like it then just go about your day, thx.

ikrenji•6mo ago

i wouldn't draw broad conclusions on this based on one study. especially since it's going against what many people self-report. studies are often flawed.

yoyohello13•6mo ago

Something really feels off about the whole thing. I use Claude code. I like it, it definitely saves me time reading docs or looking on stack overflow. It’s a phenomenal tool.

If we are to believe the hype though, shouldn’t these tools be launching software into the stratosphere? Like the CEO of stripe said AI tools provide a x100 increase in productivity. That was 3-4months ago. Shouldn’t stripe be launching rockets in to space now since that’s technically 400months of dev time? Microsoft is reportedly all in on AI coding. Shouldn’t Teams be the best, most rock solid software in existence now? There is so much hype around these tools being a super charger for more than a year, but the actual software landscape looks kind of the same to me as it did 3-4 years ago.

dmix•6mo ago

Whatever it produces still needs to be carefully reviewed and guided. Context switching as a human programmer is very hard so you need to focus on the same specific task, which is harder to not switch to social media or IRL while waiting for it. And you're going to be on the same branch for the same ticket, git doesn't let you do multiple at once (at least I'm not set up to). Not sure where the productivity scaling would come from outside of rapid experimentation on bad ideas until you find the right one, and of course rapid autocomplete, faster debugging, much fancy global find/replace type stuff.

I use it quite aggressively and I'd probably only estimate 1.5x on average.

Not world changing because we all mostly work on boring stuff and have endless backlogs.

swader999•6mo ago

I'm starting to lean into just the API tests, playwright tests and keeping the schema solid. I don't have to give a crap about what's in between all that. It's not that simple or extreme but I have a gut feeling we'll just have to get a SLA like mentality and then let it do it's thing to meet these.

deadbabe•6mo ago

Code is not the bottleneck.

AdieuToLogic•6mo ago

> Code is not the bottleneck.

Understanding the problem to solve is.

oldenlessons•6mo ago

Yup. Code is liability. Understanding it is the corresponding asset. Generating more code that is less well understood is akin to increasing leverage.

Or: buying a super car might make your commute feel faster. But if everyone did it, we'd have a lot more congestion and a lot more pollution.

uludag•6mo ago

Maybe this discrepancy is down to something like Claude code reducing the amount of brain power exhorted. If you have to do 80% less thinking to accomplish a task but the task takes just as long, you may (even rightfully) feel five times more productive even though output didn't change.

And is this a good thing since you can (in theory) multitask and work longer hours, or bad because you're acquiring cognitive debt (see "Your Brain on ChatGPT")?

christophilus•6mo ago

Exerted, fyi.

AdieuToLogic•6mo ago

Perhaps the hype is not intended for developer consumption, even though often worded as if it were, but instead meant for investors (such as VC's).

t0lo•6mo ago

Despite living in an age of supposedly transformative brainstorming and creative technologies, we’re also paradoxically inhabiting a time with less creativity and vision than ever. :)

oblio•6mo ago

Whaddaya mean, don't you like the 100th live action remake of an old animated movie?

hamandcheese•6mo ago

I imagine most CEOs are hyped about serving the same shit sandwich for less money, rather than serving up a tastier sandwich.

Or, as another commenter said, it's for investors, not developers, and certainly not the scum users.

swader999•6mo ago

I think it will. The Claude subscription only launched in May. A software company with a large code base, legacy code, paying customers,SOC, GDPR etc is a large boat to turn. I still think we are only months away from realizing the pace you describe

mrheosuper•6mo ago

IIRC most of the code in Claude is written by it(or some other LLM)

feynmanalgo•6mo ago

I see two trends recently: 1. low skill people using it for trivial projects. 2. devs writing out the whole app into memory/context down to the names of the files, interfaces, technologies, preparing the testing, compilation framework and then hand-holding the llm to arrive (eventually) at an acceptable solution. Example: https://www.youtube.com/watch?v=Y4_YYrIKLac

80% (99%?) of what you hear about llms are from the first group, amplified by influencers.

I'm guessing people feel the productivity boost because documenting/talking to/guiding/prompting/correcting an LLM is less mentally taxing than actually doing the work yourself even though time taken or overall effort is the same. They underestimate the amount of work they've had to put in to get something acceptable out of it.

christophilus•6mo ago

This is my take: I’ve been professionally developing software for 25-years. I use Claude Code daily and find it really useful. It does take a lot of effort— particularly in code review, which is my least favorite part of the job— in order to get useful, high quality results. For small, focused or boilerplate-heavy stuff, it is excellent, though. Also, the most fun part of programming (to me) is solving the problem; not so much actual implementation, so agents haven’t removed the fun yet.

swader999•6mo ago

I've just built a mobile app that does have a few sophisticated features. About 120 hours into it. (GPS, video manipulation, memory concerns) Never built on mobile before, I don't think I have the technical chops to do these hard parts on my own given the time it would have taken and the lack of focused blocks of time in my current life. It would take me a thousand hours without the Claude.

jimbokun•6mo ago

> 2. devs writing out the whole app into memory/context down to the names of the files, interfaces, technologies, preparing the testing, compilation framework and then hand-holding the llm to arrive (eventually) at an acceptable solution

Ah, Stone Soup: https://en.wikipedia.org/wiki/Stone_Soup

osn9363739•6mo ago

Let me know when you find the list. I want to see it.

rtp4me•6mo ago

I started using Claude code (CC) a couple weeks back and have some very positive outcomes. For clarity, I have been in the IT field since 1990, and my background is mainly infrastructure engineering (now DevOps). I don't write code professionally; I write tools as needed to accomplish tasks. That said, I understand end-to-end systems and the parts in the middle pretty well.

Here are some projects Claude has helped create:

1. Apache Airflow "DAG" (cron jobs) to automate dumping data from an on-prem PGSQL server to a cloud bucket. I have limited Python skills, but CC helped me focus on what I wanted to get done instead of worrying about code. It was an iterative process over a couple of days, but the net result is we now have a working model to easily perform on-prem to cloud data migrations. The Python code is complex with lots of edge conditions, but it is very readable and makes perfect sense.

2. Custom dashboard to correlate HAProxy server stats with run-time container (LXC) hooks. In this case, we needed to make sure some system services were running properly even if HAProxy said the container was running. To my surprise, CC immediately knew how to parse the HAProxy status output and match that with internal container processes. The net for this project is a very nice dashboard that tells us exactly if the container is up/down or some services inside the container are up/down. And, it even gives us detailed metrics to tell us if PGSQL replication is lagging too far behind the production server.

3. Billing summary for cloud provider. For this use case, we wanted to get a complete billing summary from our cloud provider - each VM, storage bucket, network connection, etc. And, for each object, we needed a full breakdown (VM with storage, network, compute pricing). It took a few days to get it done, but the result is a very, very nice tool that gives us a complete breakdown of what each resource costs. The first time I got it working 100%, we were able to easily save a few thousand $$ from our bill due to unused resources allocated long ago. And, to be clear, I knew nothing about API calls to the cloud provider to get this data much less the complexities of creating a web page to display the data.

4. Custom "DB Rebuild" web app. We run a number of DBs in in our dev/test network that need to get refreshed for testing. The DB guys don't know much about servers, containers, or specific commands to rebuild the DBs, so this tool is perfect. It provides a simple "rebuild db" button with status messages, etc. I wrote this with CC in a day or so, and the DB guys really like the workflow (easy for them). No need to Github tickets to do DB rebuilds; they can easily do it themselves.

Again, the key is focusing my energy on solving problems, not becoming a python/go/javascript expert. And, CC really helps me here. The productivity our team has achieved over the past few weeks is nothing short of amazing. We are creating tools that would require hiring expert coders to write, and giving us the ability to quickly iterate on new business ideas.

nzach•6mo ago

Aider is in large part written by AIs, they even have some stats here: https://aider.chat/HISTORY.html

Besides that, I think your best bet is to find someone on youtube creating something "live" using a LLM. Something like this: https://www.youtube.com/watch?v=NW6PhVdq9R8

tortila•6mo ago

After reading and hearing rave reviews I’d love to try Claude Code in my startup. I already manage Claude Team subscription, but AFAIK Code is not included, it only exists in Pro/Max which are for individual accounts. How do people use it as a subscription for a team (ideally with central billing)?

dukeyukey•6mo ago

You can use CC with AWS Bedrock, with all the centralised billing AWS offers. That's how my company handles it.

sixhobbits•6mo ago

Yeah this is super annoying and wish they'd fix it. I missed the asterisk and bought Claude Team for my team so they could use claude code but then saw it's excluded and had to go through a refund cycle and now they have to buy it individually.

kypro•6mo ago

HN has flipped so quickly on saying how AI produces unreliable slop, to most people using it to replace junior devs at their org – something I was heavily criticised for saying orgs should be doing a few months back.

Progress doesn't end here either, imo CC is more a mid-level engineer with a top-tier senior engineer's knowledge. I think we're getting to the point where we can begin to replace the majority of engineers (even seniors) for just a handful of seniors engineers to prompt and review AI produced code and PRs.

Not quite there yet, of course, but definitely feeling that shift starting now... There's going to be huge productivity boosts for tech companies towards the end this year if we can get there.

Exciting times.

hooverd•6mo ago

Where do the juniors come from?

throw1235435•6mo ago

I think the point of the poster is that there's no need for them because as he puts it replace "the majority of engineers". Tech is about to be decimated - software as a field is about to shrink quite substantially. To be honest the way things are going it is possible IMO (although would severely disadvantage me) - 3 years ago given the amount of skill required and lifetime learning in it I would of never thought this. Why do you need juniors - well you may need them but with the oversatuation it will take maybe 20 years (a generation to retire) for the current crop of seniors to get exhausted given AI will decimate them too according to the OP. That's definitely a far enough away time for it to be someone else's problem - by then AI might be significantly better anyway; we could even have superintelligence and SWE jobs might be the least of our worries.

Great for the owners of tech companies and non-tech companies using tech (many non-technical, capital based who now scale with less cost base); not so great for people with tech skills who invested for their future. Which is the goal of these AI companies. Your pain/unemployment/cost is their profit/economic surplus.

To the OP its "exciting times" for capital - probably for many people however it seems like an impending nightmare is coming especially if they are relying on intellectual work to survive/provide/etc. Most CC articles on HN trend I've seen trend to the top; which shows it is definitely instigating either greed or fear in people. I would argue that "code" seems to be the most successful AI product to date - most others are just writing assistants, meme generators, risky operators, etc at least for me. Its code that seems to instigate this persistent fear.

On a side note, at least for me, AI has personally dissuaded me from learning/investing more in this career as a senior engineer with almost 20 years experience. I hope I'm wrong.

dude250711•6mo ago

How come CC is a crappy terminal instead of some super-nice environment built by Anthropic via CC itself?

It should be capable of rebuilding VS Code but better, no?

nikcub•6mo ago

because their philosophy is that developers are tied to their editors and IDE's so you should be able to bring you're own. and they're right.

dude250711•6mo ago

Where are the awesome deeply-integrated add-ins/extensions for all popular IDEs then?

steveklabnik•6mo ago

Right now, they have ones for VS: Code and JetBrains' various ones: https://docs.anthropic.com/en/docs/claude-code/ide-integrati...

RamblingCTO•6mo ago

> CC is more a mid-level engineer with a top-tier senior engineer's knowledge.

care to provide any proof for that? in my experience it's neither.

Imanari•6mo ago

PSA: you can use CC with any model via https://github.com/musistudio/claude-code-router

The recent Kimi-K2 supposedly works great.

dejavucoder•6mo ago

thanks!

chrismustcode•6mo ago

I’d just use sst/opencode if using other models (I use it for Claude through Claude pro subscription too)

nxobject•6mo ago

Corollary if you're unfamiliar with how CC works (because you've never been able to consider it for its price, like me) – the CC client is freely available over 'npm'.

nikcub•6mo ago

> The recent Kimi-K2 supposedly works great.

My own experience is that it is below sonnet and opus 4.0 on capability - but better than gemini 2.5 pro on tool calling. It's really worth trying if you don't want to spend the $100 or $200 per month on Claude Max. I love how succinct the model is.

> you can use CC with any model via

Anthropic should just open source Claude Code - they're in a position to become the VS Code of cli coding agents.

Shout out to opencode:

https://github.com/sst/opencode

which supports all the models natively and attempts to do what CC does

Imanari•6mo ago

I tried gemini2.5 and while it is certainly a very strong model you really notice that it was not trained to be 'agentic'/with strong initiative for tool calling. Oftentimes it would make a plan, I'd say 'go ahead' and it just replied something like 'I made a todo list we are ready to implement' or something similar lol. You really had to push it to action and the whole CC experience fell apart a bit.

upcoming-sesame•6mo ago

I agree, Claude models are the most agentic oriented from the ones I've tried

upcoming-sesame•6mo ago

Where do you host Kimi-K2 ?

Imanari•6mo ago

You can use it via openrouter.ai

Imanari•6mo ago

gpt4.1 works surprisingly well although it is not as proactive as Sonnet.

graphememes•6mo ago

It's great for me. I have a claude.md at the root of every folder generally, outlined in piped text for minimal context addition about the rulesets for that folder, it always creates tests for what it's doing and is set to do so in a very specific folder in a very specific way otherwise it tries to create debug files instead. I also have set rules for re-use so that way it doesn't proliferate with "enhanced" class variants or structures and always tries to leverage what exists instead of bringing in new things unless absolutely necessary. The way I talk to it is very specific as well, I don't write huge prose, I don't set up huge PRDs and often I will only do planning if its something that I am myself unsure about. The only time I will do large text input is when I know that the LLM won't have context (it's newer than it's knowledge window).

I generally get great 1-shot (one input and the final output after all tasks are done) comments. I have moved past claude code though I am using the CLI itself with another model although I was using claude code and my reason for switching isn't that claude was a bad model it's just that it was expensive and I have access to larger models for cheaper. The CLI is the real power not the model itself per-se. Opus does perform a little better than others.

It's totally made it so I can do the code that I like to do while it works on other things during that time. I have about 60-70 different agent streams going at a time atm. Codebases sizes vary, the largest one right now is about 200m tokens (react, typescript, golang) in total and it does a good job. I've only had to tell it twice to do something differently.

leonidasv•6mo ago

Which models do you use instead of Anthropic ones?

I've only tried Claude Code with an external model once (Kimi K2) but it performed poorly.

graphememes•6mo ago

I'm using fine-tuned models some with 600b+ parameters and some with 1t+ kimi base / deepseek base and others are general purpose that are from huggingface but I use those through mcp tools

jatora•6mo ago

Can you list some of your agent streams you have going? Very curious

beoberha•6mo ago

What do you consider an “agent stream”? I can’t even imagine the cognitive overhead of managing 60-70 agents let alone the physical ability to churn through them as they complete their work and re-launch them.

jatora•6mo ago

The only way they have 60-70 agent streams is if their definition of an agent is ridiculous.

ChuckMcM•6mo ago

Reading this I can see these tools as training tools for software engineering managers.

ipaddr•6mo ago

What I wonder is how is the interview process now? Are they testing you with AI or without? Is leet code being asked with AI proving the answer?

Is there a bigger disconnect on how you are judged in an interview vs the job now?

How are the AI only developers handling this?

ct0•6mo ago

The projects you work on and the impact that they had. Hopefully.

otabdeveloper4•6mo ago

Same as it always was, you ask the candidate to explain his reasoning process in detail and ask questions along the way.

AI can't think or reason, LLMs are still mostly useless for gaming interviews.

jwpapi•6mo ago

Wait till he learns about aider

jamil7•6mo ago

I use both for different tasks, Aider is a sharp knife and claude code more of a blunt instrument.

to-too-two•6mo ago

Anyone using it for game dev? Like just having the agent try to build a game?

singron•6mo ago

I tried using aider with godot. CC would probably be better. Aider with 4o/o3-mini wasn't very good at gdscript, and it was terrible at editing tres/tscn files (which are usually modified through the editor). If you had a very code-centric game, it could turn out OK, but if you have resources/assets that you normally edit with special programs, it is going to struggle.

wahnfrieden•6mo ago

You should be using the best models. Try o3-pro

komali2•6mo ago

One thing I'm slightly anxious about in this new LLM world is whether the prices I'm paying are sustainable. I crank the fuck out of cursor and I think we're paying like 40 bucks a month for the privilege. Is this early Uber where it was unbelievable how cheap the rides were? In 2030 am I going to have gotten dependent on ai assisted levels of productivity, be making client and customer promises based on that expectation, but suddenly find myself looking at 1k+ bills, now that all the ai companies need to actually make money?

aerhardt•6mo ago

I am speaking out of intuition, but I think that part will be the quickest to be optimized. The current capabilities will cost pennies in a few years. I base this on the past trajectory but also on hearing very authorized people say that there is still room for a lot of optimizations across the stack.

djaychela•6mo ago

Can someone offer me some help? I've just been messing about "vibe coding" little python apps with local llm, continue and vscode. And I got so far with it.

Then I found the free tier of claude so I fed in the "works so far" version with the changes that the local llm made, and it fixed and updated all the issues (with clear explanation) in one go. Success!

So my next level attempt was to get all the spec and prompts for a new project (a simple manic miner style 2d game using pygame). 8 used chat gpt to craft all this and it looked sensible to me with appropriate constraints for different parts of the projrct.

Which claude created. But it keeps referring to a method which it says is not present in the code and that I'm running the wrong version. (I'm definitely not). I've tried indicating it by reference to the line number and the surrounding code but it's just gas lighting me.

Any ideas how to progress from this? I'm not expecting perfection, but it seems it's just taken me to a higher level before it runs into essentially the same issue as the local llm.

All advice appreciated, I'm just dabbling with this four a bit of fun when I can (I'm pretty unwell so do things as and when I feel up to it)

Thanks in advance.

postalcoder•6mo ago

It's likely you're running into "too deep into mediocre code with unclear interfaces and a lot of hidden assumptions hell" that LLMs are generally poor at handling. If you're running into an inextricable wall then it's better to do a controlled demolition.

ie, take everything written by chatgpt and have the highest-quality model you have summarize what the game does, and break down all the features in depth.

Then, take that document and feed it into claude. It may take a few iterations but the code you get will be much better than your attempt on iterating on the existing code.

Claude will likely zero-shot a better application or, at least, one that it can improve on itself.

If claude still insists on making up new features then install the context7 MCP server and ask it to use context7 when working on your request.

djaychela•6mo ago

Thanks.

I think I should have made it more clear in my post, the code is claude's and was done from scratch (the first app was a mandelbrot viewer which it added features to, this is a platfrom game).

It's a single file at the moment (I did give a suggested project structure with files for each area of responsibility) and it kind-of-works.

I think I could create the missing method in the class but wanted to see if it was possible by getting the tools to do it - it's as much of an experiment in the process and the end result.

Thanks for replying, I shall investigate what you've suggested and see what happens.

otabdeveloper4•6mo ago

> Any ideas how to progress from this?

You can't. This is a limitation of LLM technology. They can output the most likely token sequence, but if "likely" doesn't match "correct" for your problem then there's nothing you can do.

Also, each LLM has its own definition of what "likely" is - it comes from the training and finetuning secret sauce of that particular LLM.

hamandcheese•6mo ago

For me the best part about AI is that when I'm feeling lazy, I can tell the AI to do it. Whether it gives me gold or gives me shit, it doesn't matter, because I have now started my work.

crinkly•6mo ago

I go for a walk and have a coffee. It feels less intellectually dishonest. Best to solve human problems with human solutions.

rantallion•6mo ago

The best part is that you can do both. Give the agent a prompt, go for a walk, come back to see how well it did.

kaffekaka•6mo ago

Where in lies the dishonesty?

Having the AI just getting the wheels turning when I'm not in the mood myself has many times been a good way to make progress.

crinkly•6mo ago

When you say progress, I wonder at what cost. The apparatus for original thought isn't used and pushing through a problem and deriving the satisfaction of doing that doesn't happen. That leads to the wrong motivational cues. The brain is lazy and likes to take the easy route. Eventually you become merely a machine operator, which is something a lot of my colleagues are starting to do. And it's very worrying.

The biased behaviour towards just asking for a solution both devalues you and leads to any innovative outcomes disappearing before they had a chance to exist.

I've ended up being the company ideas man not because I'm good but everyone else stopped thinking.

jasoncartwright•6mo ago

I enjoy the idea that when programming before AI I wasn't 'merely a machine operator'

crinkly•6mo ago

I did most of my programming and thinking on paper before I committed to the machine. I still do today.

antonvs•6mo ago

But why do you use computers at all? Isn’t it intellectually dishonest?

crinkly•6mo ago

For communication and processing information quickly with determinism?

They aren't the best solution for everything. Thinking is one of those things.

leptons•6mo ago

The only time I ever wrote a program on paper was when I was 10 years old, a few months before my 11th birthday when I got my first computer. And that was in the late 1970s.

applied_heat•6mo ago

What is the niche?

I do control systems that have to work first try and after running the plant through its paces I’m out the door never to return

fsloth•6mo ago

I agree with you both. I feel LLM kills both procrastination AND capability for original thought.

But! There are some problems that are intrinsically interesting to me and others that are boring as hell but need to be solved.

In 20 years of work I’ve not come up with any way to make those boring tasks psychologically _not_ so disagreeable they are close to painfull.

But now the LLM can do whatever crap it does for those and I _finally_ can focus just on the interesting bits.

So, a boring task - off to the LLM you go - interesting - it’s mine, all mine!

danielbln•6mo ago

> So, a boring task - off to the LLM you go - interesting - it’s mine, all mine!

Yes! And hell, checking out how the machine solved a boring thing can still be a source of interest in itself, so the dopamine squeezing novelty keeps on flowing all day long.

prisenco•6mo ago

I use LLMs as a conversation partner but never let it write the code for me and this feels like I get the good parts without the bad.

kaffekaka•6mo ago

I agree and this is a risk. There is a balance to be found where the AI is power tool, not a black box.

The best use cases for me have been when I know the solution and what it should look like but don't want to write it all by hand myself. The AI can make the code just appear and I can verify it is up to spec.

hamandcheese•6mo ago

If you review some shit code and think of a better way to do it on the spot, is that original thought?

Does the answer change if the author is an LLM as opposed to another person?

Climb down from your high horse why don't you.

Scarblac•6mo ago

But then you still haven't started.

crinkly•6mo ago

People pay me to do quality work not look busy. Thinking happens before you start.

scrollaway•6mo ago

People most likely pay you to do valuable work, not quality work. One doesn’t necessarily mean the other.

If your coworkers can get into the groove faster than you (for whatever reason), you’ll continue to do quality work, not look busy, and no longer be paid for it.

crinkly•6mo ago

Quality is the value here. Because low quality has a financial risk.

And good luck finding anyone these days who can do quality.

It's a nice niche.

malthaus•6mo ago

it sounds like they're buying your confidence primarily

pc86•6mo ago

No you don't understand, this person writes quality code in an intellectually honest way because they use machines in a very specific way, just enough but not too much. Everyone else is just slinging shitcode. Good luck finding someone as good as this random person online you've never heard of!

I don't mean to attack this person specifically but it's a frankly pretty bad mindset that is far too prevalent in our profession - the way I do things in the Right Way, I write Good Code, very few other people write Good Code, and because I do things the way they were done 5, 10, 25 years ago means that my way is better than yours.

crinkly•6mo ago

There's more than just me. We keep each other honest.

ruszki•6mo ago

Looking at my salary, which is at least twice as much as my coworkers in the past 15 years at various companies, because I ship quality solutions, companies seem to think that quality is quite valuable.

swader999•6mo ago

Going for a walk, taking a nap, stepping away from the computer is a programming super power. Don't ever think it's a bad thing.

danielbln•6mo ago

It's the white page problem, that LLMs solve handily. Instead of facing the daunting task of building back up a complex mental model, I can ask the machine "yo, what were we doing? what's this thing? ok, how about you give it a spin and show me" and all of the sudden I'm back in it and can get cracking. That, and rubber ducking, and speeding up yak shaving to the nth degree really make this thing very useful to me. I also have various data sources hooked up (slack, notion, linear) so it's also a task management and project management tool for me.

storus•6mo ago

Maybe there is a reason why is your brain hesitating (e.g. low energy) and pushing it with LLMs would blow it up at some point in the future if the cause was not fixed in the meantime? Covid showed me (in a very accentuated way) that many of the times I was procrastinating was instead brain throttling due to low energy, protecting itself.

extr•6mo ago

I think for me the hesitation is in that to get started on task XYZ, the first step is often some extremely boring task ABC that is only tangentially related to the "real" task.

For example, the other day I needed to find which hook to use in a 2300 loc file of hooks. AI found me the hook in 5 seconds and showed me how to use it. This is a pure win - now that I have the name of the hook I can go read it in 30 seconds and verify it's what I wanted. If it's not, I can ask again. There's zero risk here.

jorvi•6mo ago

Re: the "white page problem", that is just known as motivation inertia. You can overcome it by training yourself to agree with yourself to "just do it" for 5 minutes, even if you really, really don't want to. If you're still demotivated after that, fine. But 9 out of 10 times, once you've started, it's surprisingly easy to keep the momentum up. This is also great for going to the gym or cleaning up your home, where an LLM can't come to the rescue :)

Re: usage of LLMs, that is honestly the way I like to use LLMs. That and auto-complete.

It's great for creating a bird's-eye view of a project because that is very fuzzy and no granular details are needed yet. And it's great at being fancy autocomplete, with its stochastic bones. But the middle part where all the complexity and edge cases are is where LLMs still fail a lot. I shudder for the teams that have to PR review devs that jubilantly declare they have "5x'ed" their output with LLMs, senior or not.

What is even more worrisome is that the brain is a muscle. We have to exercise it with thinking, hence why puzzlers stay sharp at old age. The more you outsource your code (or creative writing) thinking, the worse you get at it, and the more your brain atrophies. You're already seeing it with Claude Code, where devs panic when they hit the limit because they just might have to code unassisted.

jvanderbot•6mo ago

At the end of the day when I'd be thrashing or reverting as much as I write, I can let Jesus take the wheel for a bit to refresh.

On small issues it's usually just a glance over the diff, on harder issues it's not challenging to scold it into a good path once something starts coming together, as long as it's a localized edit with a problem I understand. I'll often take over when it's 40-60% there, which is only possible b/c it does a good job of following todo lists.

I've moved into a good daily workflow that emphasizes my own crafting during the times I'm sharp like mornings, and lets AI do the overtime and grunt work while I prepare for tomorrow or do more thoughtful writing and design.

pc86•6mo ago

So glad I've found someone else who has this workflow.

I spend probably 70% of my active coding time coding. The rest is using LLMs to keep making progress but give my brain a bit of a break while I recharge, being more supervisory with what's happening, and then taking over if necessary or after the 10-15 minutes it takes for me to start caring again.

muzani•6mo ago

Even if I want to do it myself, I'll ask it to write it a plan and toss it in a markdown file.

It gave me a shit plan today. Basically I asked it to refactor a block of prototype code that consisted of 40 files or so. It went the wrong way and tried to demolish it from the bottom up instead of making it 100% backwards compatible. If it made a mistake we would have taken forever to debug it.

But yeah, it gave me something to attack. And the plan was fixed within an hour. If I tried to make one myself, I would have frozen from the complexity or gone in circles documenting it.

kingofheroes•6mo ago

I've had great success using AI to translate my thoughts into actual words on a page (not just for software development, but also amateur writing). Bridging the vocabulary gap.

ModernMech•6mo ago

I thought this as well, but I just got burned in a way that caused me to lose so much time: I was feeling lazy and allowed AI to write some code for me. It looked good and compiled so I committed and pushed it. Weeks later I experienced a crash that I couldn't pinpoint. After hours of debugging, I eventually realized the culprit was that bit of code created by AI from weeks ago; it used a function that was defined in my code in a way that wasn't intended, causing a panic sometimes and weird off-by-one errors other times. The offending function was only a few lines of code. Had I not been lazy and just took 10 minutes to write it, I would have saved myself a whole afternoon of frustrating debugging.

Lesson learned: being lazy with AI has a hidden cost.

hamandcheese•6mo ago

Was that being lazy with AI, or being lazy with code review? That doesn't strike me as any different than rubber stamping a colleagues code.

slashdev•6mo ago

Yes, it helps a lot overcome that initial momentum. I also find it helps at the end of the day when I’m tired. I can just tell it to do the work a little bit at a time and review/fix the output. Takes less out of me.

dyl000•6mo ago

using claude code is absolutely incredible sometimes and absolute junk other times. great for easy to mid complexity refactors. ask it to do anything decently complex and it explodes. It's taken the place of a junior dev for me now. well worth the $200/mo.

perrin_veronica•6mo ago

With Cursor and CC Max both throttling and changing terms recently, I worry that the trend has started to just charge and throttle more and more like a drug dealer, until we’re all stealing and living on the floor in some abandoned building.

dejavucoder•6mo ago

lol yeah

ianberdin•6mo ago

Reading all these glowing reviews of Claude Code, I still get the feeling that either everyone’s been paid off or it’s just the die-hard fans of terminal windows and editors like Emacs and Vim. Using the terminal is right up their alley—it’s in their DNA.

Every time I read comments saying Claude Code is far better than Cursor, I fire it up, pay for a subscription, and run it on a large, complex TypeScript codebase. First, the whole process takes a hell of a lot of time. Second, the learning curve is steep: you have to work through the terminal and type commands.

And the outcome is exactly the same as with the Claude that’s built into Cursor—only slower, less clear, and the generated code is harder to review afterward. I don’t know… At this point my only impression is that all those influencers in the comments are either sponsored, or they’ve already shelled out their $200 and are now defending their choice. Or they simply haven’t used Cursor enough to figure out how to get the most out of it.

I still can’t see any real advantage to Claude Code, other than supposedly higher limits. I don’t get it. I’ve already paid for Claude Code, and I’m also paying for Cursor Pro, which is another $200, but I’m more productive with Cursor so far.

I’ve been programming for 18 years, write a ton of code every single day, and I can say Cursor gives me more. I switch between Gemini 2.5 Pro—when I need to handle tasks with a big, long context—and Claude 4.0 for routine stuff.

So no one has convinced me yet, and I haven’t seen any other benefit. Maybe later… I don’t know.

james_marks•6mo ago

Some of it is taste. Personally, I dislike VSCode and its descendants; it feels bloated and like I’m always being advertised to inside the product. I want my editor to disappear from thought, and Sublime does this for me.

In that context, CC is amazing.

Myrmornis•6mo ago

A lot of people don't realize that you can completely change the vscode UI. All the side bars in particular - just get rid of them. You're left with a blank screen on which syntax-highlighted code appears with LSP and either the command palette or if you want to be more vim/emacs-y then basically every operation is key-bindable.

pxc•6mo ago

I assume that on subsequent runs, it's less offensive, but every time I start a blank VSCode setup, I find myself repulsed by a series of pop-ups, persistent tooltips, extra tabs, etc., that are at best uninteresting and unwarranted. It definitely has strong Windows-y my-computing-environment-is-an-obstacle-course vibe.

Myrmornis•6mo ago

I think if you can't be bothered to configure a text editor then the genre isn't really for you.

moffkalast•6mo ago

That's funny I always considered Sublime to be more in my face, begging for money every time I opened it.

midasz•6mo ago

There's a neat trick to remove the nag screen

jonwinstanley•6mo ago

That’s one purchase I never regretted. Sublime was so perfect before writing code got tipped upside down

moffkalast•6mo ago

Yeah installing vscode, quite the neat trick indeed.

muzani•6mo ago

From what I see, you have to get your claude.md instructions right, and most of it is using planning mode. The CLI is exactly how you interact with it. It's less vibe code and what I'd call supervisory code.

It's less about writing code, more about building an engine that codes. It's hands off but unlike Cursor, it's not as eyes-off. You do observe and interrupt it. It is great for full on TDF - it figures out the feature, writes the test for the feature, fails the test, rewrites the code to pass. But at some point, you realize you got the instructions wrong and because you're outside the loop, you have to interrupt it.

I think it's a natural progression, but it's not for everyone. The people who hated not being able to write functions will hate claude code even more. But some people love writing their engines.

swader999•6mo ago

I'm just full of joy building again. It's hard being a manager, kids, jaded with the front end framework dance. Now I can just build build build and get my ideas out to users faster than they can decide.

cco•6mo ago

Cursor does the exact same thing? Just in a GUI where you can see it all happen.

Adifounder•6mo ago

does it? i feel like you have a lot more command with claude code, they should have ui though i agree

nico•6mo ago

I used to feel similarly. But recently started using Claude Code and it does feel a lot better than Cursor for me

I'm not sure why. However, Claude does seem to know better where things are and knows not to make unnecessary changes. I still need to guide it and tell it to do things differently some times, but it feels like it's a lot more effective

Personally, I also like that usually it just presents to me only one change/file at a time, so it's easier for me to review. Cursor might open several files at once, each with a tons of changes, which makes it a lot harder for me to understand quickly

Btw, I use Claude Code in a terminal pane inside VSCode, with the extension. So Claude does open a file tab with proposed changes

ianberdin•6mo ago

How can you prove you’re not a bot? I get the feeling I’ve seen a comment like this somewhere before.

Tell us what you do—how, under what conditions, and in which programming languages. What exactly makes it better? Does it just search for files? Well, that’s hardly objective.

So far the impression is… well, you say it only seems better. But you won’t get far with that kind of reasoning.

Objectively, it now seems to me that Claude the cat is better, because everyone around says it’s better. Yet no one has actually explained why. So the hype is inflated out of thin air, and there are no objective reasons for it.

Wowfunhappy•6mo ago

> How can you prove you’re not a bot? I get the feeling I’ve seen a comment like this somewhere before.

...you're replying to an account that was created in 2007. A bio is listed on the profile page.

Maybe you feel you've seen this comment before because it's an opinion lots of people share? Even if you do not.

mrtesthah•6mo ago

There isn’t some sort of conspiracy here, man. Anthropic gets paid either way.

Wowfunhappy•6mo ago

> Personally, I also like that usually it just presents to me only one change/file at a time, so it's easier for me to review

This is interesting. I haven't used Cursor, but one of my frustrations with Claude Code is that some of the individual changes it asks me to approve are too small for me to make a decision. There are cases where I almost denied a change initially, then realized Claude's approach made sense once I saw the full change set. Conversely, there are cases where I definitely should have stopped Claude earlier.

It doesn't help that Claude usually describes its changes after it has made the full series, instead of before.

...really, what I'd like is an easy way to go back in time, wherein going back to an earlier point in the conversation also reverted the state of the code. I can and do simulate this with git to some extent, but I'd prefer it as a layer on top of git. I want to use git to track other things.

nico•6mo ago

> some of the individual changes it asks me to approve are too small for me to make a decision. There are cases where I almost denied a change initially, then realized Claude's approach made sense once I saw the full change set

Yes, I've adapted to just review quickly, then if it makes sense as part of the task, let it keep going until it's done with the whole thing. Most of the times, by the end it does the right thing

I love that it doesn't auto-commit everything, ala aider, so it's pretty painless to undo stuff

I also keep a TODO.md file with a plan of everything I want to do for the current ticket/PR. I tell CC to keep track of things there. CC takes stuff from there, breaks it down into its own subset of tasks and when finished I tell it to update the TODO.md with the progress. I also tell it to stage files and create commits

The way I use it, it feels like I'm still programming, but I don't need to write code or run commands by myself, nor get stuck googling stuff up. I can just tell CC to do almost anything for me. It takes away the tediousness of what I want to accomplish

Wowfunhappy•6mo ago

> I love that it doesn't auto-commit everything, ala aider, so it's pretty painless to undo stuff.

Yeah, I'm definitely glad it doesn't commit for me. The main issue I have is that I'm never sure how granular to make my commits. Sometimes I make them very granular because I'm experimenting with Claude and I want to be able to revert to any point in the conversation—but now I have to write a message each time to keep track of which is which. Conversely, when I don't make the commits as granular I loose the ability to roll back, and sometimes regret it.

Also, sometimes Claude gets a bit too smart! Let's say I decide I want Claude to try again with a slightly different prompt. I save my current changes in a branch, roll back, to the previous state, and ask Claude to try again. Sometimes Claude will say "I see this is already implemented in the XX branch. Let me continue to build on that implementation."

nico•6mo ago

Yeah, sometimes I love it when it checks the git log and properly finds reference code for what I’m trying to implement, but other times I really want it to just do it differently and can get annoying

Other times I’ll tell it about an issue that I want to solve and it will come up with a solution I don’t want. I’ll tell it to take a different approach and it will listen for a bit, then all of a sudden just try to go back to its first approach and I need to steer it again, multiple times even

> but now I have to write a message each time to keep track of which is which

I ask it to write my commits. Usually it’s also pretty smart about which files to include based on the most recently addressed tasks. I have it run git add and git commit under my supervision

A repo Im working on, has some rather annoying hooks that check linting when committing (and aggressively modify the files to fix formatting). If I forget to manually check before committing, then I end up with a “wrong” commit containing the incorrectly formatted files, and a bunch of uncommitted files with the formatting changes. CC most of the times will see the error messages, and automatically propose the proper commands to run for undoing or amending the commit to include the formatting changes

slashdev•6mo ago

My commit messages have always been crap, so this doesn’t bother me much. I squash it all at the end anyway.

Still sometimes I forget to commit between tasks or it goes wild fixing compiler errors I didn’t want to address.

beyang•6mo ago

If you like Claude Code but either (1) prefer an agent that doesn't ask for review on each file edit or (2) miss the IDE for things like reviewing diffs, I'd humbly submit you try out Amp: https://ampcode.com. It has both a CLI and VS Code extension, and we built it from the ground up for agentic coding, so no asking for permission on each edit, a first-class editor extension (personally I spend more and more time reviewing diffs and VS Code's diff view is great), and it employs subagents for codebase search and extended thinking (using a combo of Sonnet and o3) to maximize use of the context window.

nico•6mo ago

Thank you for the suggestion. Do you guys also have subscription plans? Or do I need to pay separately for the models/apis I use?

varispeed•6mo ago

I rarely had Cursor do unwanted changes for me. Maybe this is about prompting? I am very particular with what I want and I try to be as verbose as I can and explain the context as well plus hint which files I believe it should put attention to.

ianberdin•6mo ago

And so you don’t think I just dropped in to try it out: no, I’ve been using Cursor every day for many months. I set up the rules very precisely, make maps, write instructions and constraints so the LLM understands how to work with the current codebase—where to find things, what to do, what not to do, what things should look like. I try to write as concisely as possible so everything fits in the context window.

I have a monorepo, so different instructions and rules live in different corners of it, which I manually add as needed. Doing all this with Claude is hard: it kind of works, but it’s a kludge. It’s much easier to manage through the UI. As the saying goes, it’s better to see something once than to remember it ten times, type a command, and configure it.

I can say the same about Vim and Emacs users. No one has yet proven to me they can code and navigate faster than an ordinary programmer using a trackpad or a mouse and keyboard. It’s purely a matter of taste. I’ve noticed that people who love those tools are just bored at work and want to entertain their brains. That’s neither good nor bad, but it doesn’t give a real boost.

By the way, there’s a study (a bit old, admittedly) showing that working with an LLM still doesn’t provide a massive speed-up in programming. Yes, it spits out code quickly, but assembling a finished result takes much longer: you have to review, polish, start over, throw things out, start again. It’s not that simple.

Speaking of Claude, he’s a real master at shitting in the code. At the speed of sound he generates a ton of unnecessary code even when you ask him not to. You ask for something minimalist, nothing extra—he still slaps on a pile of code with useless tests that outsmart themselves and don’t work.

That’s the downside. Overall, it’s easier to program when you just say what to do instead of sitting there straining your own brain. But even then it’s not simple. If you yourself haven’t figured out very carefully what you’re doing, you’ll end up with crap. I almost always throw away what I wrote with an LLM when I was tired. I don’t think there’s ever been an exception.

The only way to write something decent is to dig in thoroughly: draw diagrams, write everything out, and only then tell the LLM exactly what to do. Then double-check, make sure it didn’t write anything extra, delete the excess. And only like that, in small chunks. The “make me a feature, I pressed a button and walked away” mode doesn’t work if the task is even slightly complex.

Yeah, I got carried away—had to let it out. I have a lot to say about LLMs and programming with them, and not just about Cursor and Claude. The tools completely upended the way we program...

swader999•6mo ago

I remember a few projects before all the AI where I would setup templates, little code generators, patterns to follow. Just make it really easy to do the right things. Was so easy to get things done. Was a small team, like minded, easy to row in the same direction. Your post reminded me of that.

I too notice this about Claude, I've written commands, massaged Claude.md, even hooks. Given it very precise feature guides. I see the same issues you do. Feels like Claude has a lazy lever built in to it.

I play GPT O3 off it constantly, it takes the spec, a repomix slice of the Claude work and gives a new gap.md for Claude to pursue. Rinse and repeat. It works but your cursor flow seems better.

My highest velocity was about 1.6 fib complexity points a day over thirty years, now it's 4.3 with Claude the last three weeks which is nuts. I'm a total hack, I think I could get it up to nine if I was a bit more organized. Probably won't have to, just wait for the next iteration of the Claude models.

oblio•6mo ago

> My highest velocity was about 1.6 fib complexity points a day over thirty years, now it's 4.3 with Claude the last three weeks which is nuts.

Where do you work that you managed to track what I assume are Agile story points over 30 years, also, so accurately?

I don't even remember my median number for a sprint, let alone my daily average over long periods of time.

swader999•6mo ago

Yeah, accurately since 2008. Before that was just a guess but it wasn't at all fast. I've worked a lot of places but always insisted on story points based on estimated dev complexity.

Waterluvian•6mo ago

I think it’s because a lot of people deeply misunderstand what the hard parts of software development actually are. Most of the time, programs won’t involve novel or complicated algorithms or whatnot. It’s mostly just gluing ideas together. But that all comes after the specification, design, architecture, planning, etc.

I think it can also be deceptive the same way it can be in basically any trade: it’s easy for the job to get “done” and work. It’s hard to do it properly in a way that it has a lasting durability. With A.I. we can spit out these programs that work and look done. And for prototypes or throwaways and such, that’s probably wonderful! But that’s not going to fly if you’re building a house to spec for people who are going to live in it for 30 years.

javier2•6mo ago

Yeah, writing code has been near trivial since I started working 12 years ago. Used Intellij since 2012. The difficult part was always reading old code and figure out the boundaries where backward comparibility breaks and figure out how to execute a rollout safely.

LeafItAlone•6mo ago

>it’s easy for the job to get “done” and work. It’s hard to do it properly in a way that it has a lasting durability. With A.I. we can spit out these programs that work and look done. And for prototypes or throwaways and such, that’s probably wonderful! But that’s not going to fly if you’re building a house to spec for people who are going to live in it for 30 years.

Let’s be honest, that’s what most companies pay good salaries for most software developers for.

When staring at a new job or project, do you more find yourself praising the architecture and quality? Or wondering how it got to this point?

Waterluvian•6mo ago

Moving fast isn’t necessarily always the wrong choice. Often it’s the right choice… for a time…

Maybe AI will disrupt the rush to market phase. Maybe that makes complete sense, tbh. But there’s a whole realm of sober engineering that still needs to be done properly.

ActionHank•6mo ago

What people haven't realised yet is that Cursor isn't a product, it's a collection of features that every product is feverishly working to add.

The key take away here is that save for the deep integration, there is a strategy for working with a best in class agent solution agnostic of tooling.

These learnings will eventually coalesce into "best practices" that people will apply using the editor or IDE of their choice and all these vscode forks will have died off.

WorldMaker•6mo ago

Arguably this is why I'm still long on GitHub Copilot. Love them or hate them Microsoft/GitHub are proven winners in feature parity work and have such a big moat today. Why use a VS Code fork when you can "use the real thing"?

Microsoft/GitHub has even signaled confidence that they can compete feature by feature with the forks by even open sourcing much of the Copilot integration in VS Code.

Aesthetically, I like VS Code Copilot/Copilot Chat UIs/UX for the most part, certainly better than I like Claude Code and more than I like Cursor, too.

rgreeko42•6mo ago

I use Intellij Idea and I don't want to learn a new IDE. Claude Code is an easy TUI, unlike Vim or Emacs. It's a simpler solution to me.

aschobel•6mo ago

My biggest piece of starting advice would be use it for exploration not generation. It can be great for understanding code bases and looking at git history.

Context is key, so it is also really helpful having a CLAUDE.md file; the /init command can create one for you.

If you are going to use it for generation (writing code); plan mode is a must. Hit shift-tab twice.

Finally; I'm mostly using the claude 4 sonnet model, but make sure to tell it to "ultrathink" (use those words); this means it can use more thinking tokens.

jonwinstanley•6mo ago

Cursor and code completion is such a time saver for me. Having the next few lines guessed feels magic.

Having an agent completely design/write a feature still feels weird. I like everything being hand written, even if I actually just tab completed and edited it slightly

SteveJS•6mo ago

I used it for 2 weeks with the cheap $17/mo sub. It is equal parts amazing, and frustrating.

I ended up with 8k lines of rust and 12k lines of markdown. I think those markdown designs and explicit tasks were required the same way unit tests with a test harness are required to make the human-tool interaction work.

However, I’m not sure if the ‘magic’ is VC-subsidy or something else.

It did make rust (a language i do not know) feel like a scripting language. … the github repo is ‘knowseams’.

AndrewKemendo•6mo ago

Is your complaint about specifically claude or about GPT assistants generall?

I’m curious what this means:

> run it on a large, complex TypeScript codebase

What do you mean by “run it?”

Are you putting the entire codebase all at once into the context window? Are you giving context and structure prompts or a system architecture first?

Most of the people I see fail to “get” GPT assistants is because they don’t give context and general step-by-step instructions to it.

If you treat it like a really advanced rubber duck it’s straight up magic but you still have to be the senior engineer guiding the project or task.

You can’t just dump a 10,000 LOC file into it ask some vague questions and expect to get anything of value out of it.

jcelerier•6mo ago

> Second, the learning curve is steep: you have to work through the terminal and type commands.

that's supposed to be part of CS101

phito•6mo ago

I just feel sad for all the professional developers out there that are still scared of the terminal.

mountainriver•6mo ago

it's just a worse interface for writing code, no one is scared of anything

corytheboyd•6mo ago

I’m very effective with JetBrains IDEs, VSCode is not in the same league. I’m very effective with a terminal. Claude Code is just enough of a bridge between models and my IDE to enhance my workflow without ruining my IDE. That’s it. I don’t pay $200.

93po•6mo ago

i hate terminal stuff and refuse to write code in terminals, and i still think claude code has better results than cursor IDE with sonnet 4. it might be a mixture of it mixing in opus (which i never really used with cursor) or it might be that there's a bigger context window or its agentic process is somewhat different. im not entirely sure. but i feel like it gets things right more consistently and introduces fewer issues. it might also be the use of CLAUDE.md which i think has helped a ton in consistently making sure its following best practices for my project

tom_m•6mo ago

Die hard fans. I tried so many Claude models. Some are ok. Really. Surprisingly some versions of Sonnet are better than Opus. So the latest isn't always the greatest. Models are specialized sometimes (intentionally designed that way or not).

I think people just need to try a bunch to be honest. I think that we will get to a point (or are already there) where some models just resonate better with you and your communication (prompt) style.

But after using a bunch...I have to say that Gemini 2.5 Pro is consistently the best one I've used yet. It's pricey, but it just works.

I'm still looking for a good local model, but it's just going to be a while until anything local can match Gemini. It will eventually I think, but it'll take some time.

I probably won't be using Claude Code, but I'm glad it works for some folks.

mitchitized•6mo ago

Each use case, which for many is a project-by-project thing, likely determines the right tool for the job.

For new projects, I find Claude Code extremely helpful, as I start out with a business document, a top-level requirements document, and go from there. In the end (and with not a lot of work or time spent) I have a README, implementation plan, high-level architecture, milestones, and oftentimes a swagger spec, pipeline setup and a test harness.

IMHO pointing CC at a folder of a big typescript project is going to waste a ton of compute and tokens, for minimal value. That is not a good use of this tool. I also have a pretty strong opinion that a large, complex typescript codebase is a bad idea for humans too.

Point CC at a python or go repo and it is a whole 'nother experience. Also, starting out is where CC really shines as stated above.

For a big complex typescript repo I would want very specific, targeted help as opposed to agentic big-picture stuff. But that also minimizes the very reason I'd be reaching for help in the first place.

ak217•6mo ago

The key to productivity with these tools (as with most things) is a tight feedback loop.

Cursor has an amazing tab completion model. It uses a bunch of heuristics to try to understand what you're doing and accelerate it. It's a bit like a very advanced macro engine, except I don't have to twist my brain into a pretzel trying to program my editor - it just works, and when it doesn't I press Esc, which takes about half a second. Or I reach for the agentic mode, where the iteration cycle is more like 30 seconds to a minute.

With the fully agentic editors, it takes more like 15 to 30 minutes, and now it's a full on interruption in my flow. Reviewing the agent's output is a chore. It takes so much more mental bandwidth than the quick accept/reject cycle in the editor. And I have to choose between giving it network access (huge security hazard) or keeping it offline (crippling it in most codebases).

I find that for throwaway codebases where I don't care about maintenance/security/reliability, I can put up with this. It's also incredibly useful when working with a language or framework that I don't fully understand (but the model understands better). But for most of my work it produces poor results and ends up being a net negative for productivity - for now.

I'm sure this will improve, but for now I consistently get better results in Cursor.

chis•6mo ago

So far my feeling is that cursor is a bit better for working on existing large codebases, whereas claude does a good job planning and building greenfield projects that require a lot of code.

That does net out to meaning that Cursor gets used way more often, atm.

varispeed•6mo ago

Looks like I am not the one. I look at the Claude Code reviews and feel as if I had a stroke. What is it that I am not getting?

The whole thing looks like one large source of friction.

I've been using Cursor a lot and it speeds up everything for me. Not sure what kind of benefit Claude Code would give me.

eric_cc•6mo ago

I’m not a vim and eMacs guys - or terminal guy.

Claude Code + Cursor is my setup. Claude Code is my workhorse that I use for most tasks.

For small tiny quick changes, I use Cursor in auto.

For architecture consulting I use Cursor chat with Grok 4.

Claude Code is much better at banging out features than Cursor’s Claude. And Claude itself is better than other models as a coding workhorse.

All that said: while we may all code, we are building wildly different things that have much different specific problems to solve so it doesn’t surprise me that there isn’t a universal setup that works best for everybody.

WXLCKNO•6mo ago

What did you use before Grok 4 came out for architecture consulting, Gemini 2.5 Pro?

eric_cc•6mo ago

Yes 2.5 pro and o3. I’ve been very impressed with Grok 4 so far for this purpose. It’s very thorough and accurate. Granted my time with grok 4 is limited to a few days but I’ve already given it challenging problems and the results were exceptional.

solumunus•6mo ago

I would wager that you’re either not leveraging custom context files or you’re doing it very poorly.

mountainriver•6mo ago

You gotta hand it to anthropic for being able to build a demonstrably worse interface yet convince whole swaths of the eng community that its better.

I also suspect a lot of this is paid for or just terminal die hards.

awacs•6mo ago

I think you need to be a bit more explicit in your usage. I use Claude Code as the plugin in Cursor, to me the marriage of all the great, tastes great, less filling. You see all the changes as you normally would in VSCode / Cursor, but the Claude Code command line UI is absolutely bonkers better than the Cursor native one. They really have it dialed in beautifully in my opinion.

deorder•6mo ago

I am a software developer with over 25 years of professional experience and have been working with coding agents for quite some time starting with AutoGPT and now using Claude Code almost 24/7 orchestrated via Task Master to automatically spin up new instances working on a multi layer project.

You are absolutely right. A large portion are influencers (I would estimate around 95% of those you see on YouTube and forums) that are full of hype. I think most are not affiliated with Anthropic or any vendor, they are just trying to sell a course, ebook or some "get rich with AI" scheme.

What I appreciate about Claude Code:

- Since it is a terminal/CLI tool it can be run headlessly from cron jobs or scripts. This makes it easy to automate.

- I appreciate the predictable pricing model. A flat monthly fee gives me access to Claude Sonnet and Opus 4 in five-hour sessions each with its own usage limit that resets at the start of a new session. There is a fair use policy of around 50 sessions per month, but I haven’t hit that yet. I deliberately run only one instance at a time as I prefer to use it responsibly unlike some of the "vibe" influencers who seem to push it to the limit.

That's it. Despite being a CLI based tool, Claude Code is remarkably out of the box for what it offers.

That said, no coding agent I have encountered can fully ingest a large inconsistent legacy codebase especially one with mixed architectures that accumulated over years. This limitation is mainly due to context size constraints, but I expect this to improve as context windows grow.

jshen•6mo ago

Expecting everyone to change their editor/ide is a big hurdle, and cursor doesn't support everything well like iOS development. A tool that works independently from your editor of choice is much easier to adopt. It's also much easier to script and automate.

827a•6mo ago

IMO: The people who are obsessed with Claude Code are the people who weren't really engineers to begin with. For them, any level of productivity is "extremely impressive" and "going to kill engineering jobs" because they never really understood what software engineers do, and had no sense of what the demarcations on the scale of productivity even should be.

The reason why they like Claude Code specifically over Cursor isn't because they were fans of terminal windows; on the contrary, its because CC is simpler. Cursor is complicated, and its interface communicates to the user that eventually you might be grabbing the wheel. Of course, experienced engineers want to grab the wheel; working together with the AI toward some goal. But these CC-stans don't. All that extra stuff in the UI scares them, because they wouldn't know what to do with it if they had to do something with it.

Its this particular kind of person that's also all over /r/ClaudeAI or /r/Cursor complaining about rate limits. If you've ever used these products, you'd realize very quickly: The only way you're hitting rate limits so quickly on the paid plans is if the only code that's being outputted is from the AI, nonstop, for eight hours a day, using only the most expensive and intelligent models. The only people who do this are the people who have no brain of their own to contribute to the process. Most CC/Cursor users don't hit rate limits, because they're working with the AI, like a tool, not managing it like a direct report.

treyd•6mo ago

I don't care what anyone says about Cursor's workflow, I will never install anything based on VS Code, it's just a nonstarter. That's enough for me.

smoody07•6mo ago

I used Cursor daily for about six months before switching to CC and it took awhile. TUI is not my first choice. But their tools and prompts make it work fare more coherently, and I've had far fewer train wrecks in my code since fully switching a few weeks ago. The IDE extension helps.

Adifounder•6mo ago

i feel like it needs a ui, but then again that might kill the purpose of it, how should they go about it?

jonstewart•6mo ago

What is “branch-analysis.md” for in CC? Googling comes up short.

dejavucoder•6mo ago

It's just a you can tell claude to make to write notes to

swader999•6mo ago

My biggest problem at the moment is that I am more inclined to talk to Claude and GPT O3 than the other more junior devs on my team. I just get more done. I'm not being facetious here, I really don't like that about the way I've been the past few days. I have to use the new free time I have to work more with them I guess.

jmkni•6mo ago

I guess the inverse is true too though, junior devs are more inclined to ask AI than their superiors.

swader999•6mo ago

Except that they don't. Most can't afford it. My fault though, I'm about a month away from getting subscriptions for all of them and then I'll mob program with them and Claude until they either sink or swim. I'm in full on learning mode, still trying to prove it out and get a real validation of it all ..

Herring•6mo ago

Yeah I noticed the same. I would much rather have sensitive discussions (eg US foreign policy) with Gemini pro rather than with a human.

I was recently watching Seinfeld, and apparently in the 90s, picking up a friend from the airport was an important social contract. Today we just uber.

kierangill•6mo ago

Past the basic setup, what are people doing to grease the tracks for these tools?

That is, how are people organizing their context and codebase to help the tooling guide themselves to the right answer?

I have some half-baked thoughts here [1], but I know there are more (better) methodologies to be discovered.

[1]: https://blog.kierangill.xyz/oversight-and-guidance

clbrmbr•6mo ago

I've been getting a lot of alpha out of Claude Code, but struggling a bit with how to bring it to the rest of my team. Does anyone have any practical advice on sharing best practices or generally getting team members or folks you supervise to also benefit from Claude Code?

NewUser76312•6mo ago

I'm still not sold on these "AIs in your codebase" tools. Same for 'vibe coding'.

I use AI like a scalpel - I go in and figure out exactly what I want, use my prompting experience to get a few functions, or up to 1 script at a time, read it, put it in, and stress-test it immediately.

As a result, I am extremely happy with my AI-generated code about 95% of the time. However, this requires that I still understand my codebase, requirements, failure modes, etc.

My philosophy is that software dev creation speed is certainly important, but durability is even more critical. Currently, it seems like going too deep with these AI tools is sacrificing the former for the latter. If all you're doing is POCs all day I could see it (even then, don't let them get too big...) but it's just not for me, at least not yet.

_sharp•6mo ago

You should consider documenting your durability requirements, failure modes, important context related to the feature/component your working on, as you would a junior engineer. Then, you can re-use the context documents. If the AI doesn't know what your expectations are, it will just assume you need a PoC, and give you PoC code quality.

williamsss•6mo ago

As a daily active user of Cursor and CC user myself, it’s gratifying to discover nuanced tricks that push productivity further. I stumble upon one every now and again.

Solid write up. And chock full of useful tricks!

I was manually copy pasting PR change requests today to CC where some of this would have saved my wrists some pain.

actinium226•6mo ago

I don't see any actual work getting done here. Just one part where it added reply_to_email to a bunch of classes and removed it from a couple places, which doesn't seem like a big lift.

Adifounder•6mo ago

seems like claudes strength is being overhyped by the people who can actually make it work for them. why is the learning curve so hard.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The Waymo World Model

How we made geo joins 400× faster with H3 indexes

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Show HN: I spent 4 years building a UI design tool with only the features I use

Dark Alley Mathematics

Microsoft open-sources LiteBox, a security-focused library OS

Show HN: If you lose your memory, how to regain access to your computer?

Sheldon Brown's Bicycle Technical Info

Hackers (1995) Animated Experience

Unseen Footage of Atari Battlezone Arcade Cabinet Production

An Update on Heroku

PC Floppy Copy Protection: Vault Prolok

Delimited Continuations vs. Lwt for Threads

Show HN: ARM64 Android Dev Kit

Why I Joined OpenAI

How to effectively write quality code with AI

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

Introducing the Developer Knowledge API and MCP Server

Female Asian Elephant Calf Born at the Smithsonian National Zoo

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

Learning from context is harder than we thought

Understanding Neural Network, Visually

I now assume that all ads on Apple news are scams

FORTH? Really!?

Evaluating and mitigating the growing risk of LLM-discovered 0-days

WebView performance significantly slower than PWA

I'm going to cure my girlfriend's brain tumor

Show HN: Smooth CLI – Token-efficient browser for AI agents

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The Waymo World Model

How we made geo joins 400× faster with H3 indexes

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Show HN: I spent 4 years building a UI design tool with only the features I use

Dark Alley Mathematics

Microsoft open-sources LiteBox, a security-focused library OS

Show HN: If you lose your memory, how to regain access to your computer?

Sheldon Brown's Bicycle Technical Info

Hackers (1995) Animated Experience

Unseen Footage of Atari Battlezone Arcade Cabinet Production

An Update on Heroku

PC Floppy Copy Protection: Vault Prolok

Delimited Continuations vs. Lwt for Threads

Show HN: ARM64 Android Dev Kit

Why I Joined OpenAI

How to effectively write quality code with AI

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

Introducing the Developer Knowledge API and MCP Server

Female Asian Elephant Calf Born at the Smithsonian National Zoo

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

Learning from context is harder than we thought

Understanding Neural Network, Visually

I now assume that all ads on Apple news are scams

FORTH? Really!?

Evaluating and mitigating the growing risk of LLM-discovered 0-days

WebView performance significantly slower than PWA

I'm going to cure my girlfriend's brain tumor

Show HN: Smooth CLI – Token-efficient browser for AI agents

My experience with Claude Code after two weeks of adventures

Comments