I started by copy-pasting more and more stuff into ChatGPT. Then using more and more in-IDE prompting, then more and more agent tools (Claude etc). And suddenly I realise I barely hand-code anymore
For sure there's still a place for manual coding, especially schemas/queries or other fiddly things where a tiny mistake gets amplified, but the vast majority of "basic work" is now just prompting, and honestly the code quality is _better_ than it was before; all kinds of refactors I didn't think about or couldn't be bothered with have happened almost automatically
And people still call them stochastic parrots
Both can be true. You're tapping into every line of code publicly available, and your day-to-day really isn't that unique. They're really good at this kind of work.
ChatGPT 3.5/4 (2023-2024): The chat interface was verbose and clunky and it was just... wrong... like 70+% of the time. Not worth using.
Copilot autocomplete and GitLab Duo and Junie (late 2024-early 2025): Wayyy too aggressive at guessing exactly what I wasn't trying to do, and it hijacked my tab complete when pre-LLM type-tetris autocomplete was just more reliable.
Copilot Edit/early Cursor (early 2025): Ok, I can sort of see uses here, but god, picking the right files all the time is such a pain, since it really means I need to have figured out what I wanted to do in such detail already that what was even the point? Also the models at that time just quickly descended into incoherency after like three prompts; if it went off track, good luck ever correcting it.
Copilot Agent mode / Cursor (late 2025): Ok, great, if the task is narrowly scoped, and I'm either going to write the tests for it or it's refactoring existing code, it could do something. Something mechanical, like a library migration where we need to replace the use of methods A/B/C with a different combination of X/Y/Z: great, it can do that (a hypothetical sketch of what I mean is below). Or like CRUD controller #341. I mean, sure, if my boss is going to pay for it, but not life changing.
Zed Agent mode / Cursor agent mode / Claude code (early 2026): Finally something where I can like describe the architecture and requirements of a feature, let it code, review that code, give it written instructions on how to clean it up / refactor / missing tests, and iterate.
But that was like 2 years of "really, it's better and revolutionary now" before it actually got there. Now maybe in some languages or problem domains it was useful for people earlier, but I can understand people who don't care about "but it works now" when they're hearing it for the sixth time.
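To make the "mechanical migration" point above concrete, here is a minimal hypothetical sketch in Python (the fetch/parse/store and load_into method names are invented for illustration, not taken from any real library) of the before/after shape of change an agent grinds through reliably:

    # Hypothetical example: a library deprecates a three-step
    # fetch()/parse()/store() dance in favour of a single load_into() call.
    # The change is purely mechanical and repeated across many call sites,
    # which is exactly the kind of migration an agent handles well.

    # Before the migration (v1 API):
    def sync_report_v1(client, source, db):
        raw = client.fetch(source)     # deprecated
        record = client.parse(raw)     # deprecated
        client.store(db, record)       # deprecated

    # After the migration (v2 API):
    def sync_report_v2(client, source, db):
        client.load_into(db, source)   # single replacement call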
And I mean, what one hand gives, the other takes away. I have a decent amount of new work dealing with MRs from my coworkers where they just grabbed the requirements from a stakeholder, shoved them into Claude or Cursor, it passed the existing tests, and it shipped without much understanding. When they wrote them themselves, they tested more and were more prepared to support it in production...
It's such a visual and experiential thing that writing true success criteria it can iterate on seems borderline impossible ahead of time.
Or slower, when the LLM doesn't understand what I want, which is a bigger issue when you spawn experiments from scratch (and have given limited context around what you are about to do).
Any qualified guesses?
I'm not convinced more traders on wall street will allocate capital more effectively leading to economic growth.
Will more programmers grow the economy? Or should we get real jobs ;)
Like, do these guys actually dogfood the real user experience, or are they all admins with a fast lane to the real model, while everyone outside the org has to go through the 10 layers of model shedding, caching and other means and methods of saving money?
We all know these models are expensive as fuck to run and these companies are degrading service, A/B testing, and the rest. Do they actually ponder these things directly?
Just always seems like people are on drugs when they talk about the capabilities, and like, the drugs could be pure shit (good) or ditch weed, and we all just act like the pipeline for drugs is a consistent thing, but it's really not, not at this stage where they're all burning cash through infrastructure. Definitely, like drug dealers, you know they're cutting the good stuff with low-cost cached gibberish.
The underlying models are all actually really undifferentiated under the covers except for the post-training and base prompts. If you eliminate the base prompts the models behave near identically.
A conspiracy would be a helluva lot more interesting and fun, but I've spoken to these folks firsthand and it seems they already have enough challenges keeping the beast running.
Can confirm. My partner's ChatGPT wouldn't return anything useful for her given a specific query involving web use, while I got the desired result sitting side by side. She contacted support and they said there was nothing they could do about it; her account is in an A/B test group with some features removed. I imagine this saves them considerable resources despite still billing customers for them.
How much this is occurring is anyone's guess.
I’ve always said I’m a builder even though I’ve also enjoyed programming (but for an outcome, never for the sake of the code)
This perfectly sums up what I’ve been observing between people like me (builders) who are ecstatic about this new world and programmers who talk about the craft of programming, sometimes butting heads.
One viewpoint isn’t necessarily more valid, just a difference of wiring.
Managers and project managers are valuable roles and have important skill sets. But there's really very little connection with the role of software development that used to exist.
It's a bit odd to me to include both of these roles under a single label of "builders", as they have so little in common.
EDIT: this goes into more detail about how coding (and soon other kinds of knowledge work) is just a management task now: https://www.oneusefulthing.org/p/management-as-ai-superpower...
"I got into programming because I like programming, not whatever this is..."
Yes, I'm building stupid things faster, but I didn't get into programming because I wanted to build tons of things. I got into it for the thrill of defining a problem in terms of data structures and instructions a computer could understand, entering those instructions into the computer, and then watching victoriously while those instructions were executed.
If I was intellectually excited about telling something to do this for me, I'd have gotten into management.
This distinction to me separates the two primary camps
What I mean by that: you had compiled vs interpreted languages, you had typed vs untyped, testing strategies; all that, at least in some part, was a conversation about the tradeoffs between moving fast/shipping and maintainability.
But it isn't just tech, it is also in methodologies and the words we use, from "build fast and break things" and "yagni" to "design patterns" and "abstractions"
As you say, it is a different viewpoint... but my biggest concern with where we are as an industry is that these are not just "equally valid" viewpoints of how to build software... they are quite literally different stages of software that, AFAICT, pretty much all successful software has to go through.
Much of my career has been spent in teams at companies with products that are undergoing the transition from "hip app built by scrappy team" to "profitable, reliable software" and it is painful. Going from something where you have 5 people who know all the ins and outs and can fix serious bugs or ship features in a few days to something that has easy clean boundaries to scale to 100 engineers of a wide range of familiarities with the tech, the problem domain, skill levels, and opinions is just really hard. I am not convinced yet that AI will solve the problem, and I am also unsure it doesn't risk making it worse (at least in the short term)
I deliberately avoid full vibe coding since I think doing so will rust my skills as a programmer. It also really doesn’t save much time in my experience. Once I have a design in mind, implementation is not the hard part.
I had felt like this and still do, but man, at some point the management churn feels real & I just feel like I'm suffering from a new problem.
Suppose I actually end up having services literally deployed from a single prompt, nothing else. Earlier I used to have AI write code but I was interested in the deployment and everything around it; now there are services which do that really neatly for you (I also really didn't give in to the agent hype and mostly used browser LLMs).
Like on one hand you feel more free to build projects, but the whole joy of the project got completely reduced.
I mean, I guess I am one of the junior devs, so to me AI writing code on topics I didn't know/prototyping felt awesome.
I mean I was still involved in, say, copy pasting or looking at the code it generates, seeing the errors and sometimes trying things out myself. If AI is doing all that too, idk
For some reason, recently I have been disinterested in AI. I have used it quite a lot for prototyping, but this complete out-of-the-loop programming with recent services just feels very off to me.
I also feel like there is this sense that if I pay for some AI thing, I have to maximally extract "value" out of it.
I guess the issue could be that I can give vague terms or a very small text file as input (like just "do X alternative in Y lang") and I am then unable to understand the architectural decisions, and feel overwhelmed by what comes out of it.
Probably gonna take either spec-driven development, where I clearly define the architecture, or something I saw primagen do recently, where the AI will only manipulate the code of one particular function (I am imagining it for a file as well). Somehow I feel like that's something I could enjoy more, because right now it feels like I don't know what I have built at times.
When I prototype with single-file projects using, say, the browser for funsies/any idea, I get some idea of what the code uses, its dependencies and function names from start to end, even if I didn't look at the middle.
A bit of a ramble I guess, but the thing which is kind of making me feel this is that I was talking to somebody and showcasing them some service where AI + server is there, and they asked for something in a prompt and I wrote it. Then I let it do its job, but I was also thinking how I would architect it (it was some detect-food-then-find-BMR thing, and I was thinking first to use any API, but then I thought that meh, it might be hard, why not use AI vision models, okay what's the best, Gemini seems good/cheap)
and I went to the coding thing to see what it did, and it actually went even beyond by using the free tier of Gemini (which I guess didn't end up working, could be some rate limit on my own key, but honestly it would've been the thing I would've tried too).
So like, I used to pride myself on the architectural decisions I make even if AI could write code faster but now that is taken away as well.
I really don't want to read AI code, so much so that honestly, at this point, I might as well write the code myself and learn hands-on, but I have a problem with this build-fast-in-public-like attitude that I have & just not finding it fun.
I feel like I should take a more active role in my projects & I am really just figuring out what's the perfect way to use AI in such contexts & when to use how much.
Thoughts?
And accountability can still exist? Is the engineer that created or reviewed a Pull Request using Claude Code less accountable than one that used PICO?
The point is that in the human scenario, you can hold the human agents accountable. You cannot do that with AI. Of course, you as the orchestrator of agents will be accountable to someone, but you won't have the benefit of holding your "subordinates" accountable, which is what you do in a human team. IMO, this renders the whole situation vastly different (whether good or bad I'm not sure).
I expect interviews will evolve into "build project X with an LLM while we watch" and an audit of agent specs
fun stats: correlation is real, people who were good at vibe coding also had offer(s) from other companies that didn't run vibe-code interviews.
I have a professor who has researched auto generated code for decades and about six months ago he told me he didn't think AI would make humans obsolete but that it was like other incremental tools over the years and it would just make good coders even better than other coders. He also said it would probably come with its share of disappointments and never be fully autonomous. Some of what he said was a critique of AI and some of it was just pointing out that it's very difficult to have perfect code/specs.
Billionaire coder: a person who has "written" a billion lines.
Ordinary coders: people with only a couple of thousand to their git blame.
This is about where I'm at. I love pure claude code for code I don't care about, but for anything I'm working on with other people I need to audit the results - which I much prefer to do in an IDE.
After you've tried it, come back.
I tried a website which offered the Opus model in their agentic workflow & I felt something different too I guess.
Currently trying out Kimi code (using their recent kimi 2.5), the first time I've bought any AI product, because I got it for like $1.49 per month. It does feel a bit less powerful than Claude Code, but I feel like monetarily it's worth it.
Y'know, you have to like bargain with an AI model to reduce its pricing, which I just felt really curious about. The psychology behind it feels fascinating, because I think even as a frugal person, I already felt invested enough in the model and that became my sunk cost fallacy.
Shame for me personally, because they use it as a hook to get people using their tool and then charge $19 the next month (I mean, really, cheaper than Claude Code for the most part, but still, compared to $1.49).
I've been working in the mobile space since 2009, though primarily as a designer and then product manager. I work in kinda a hybrid engineering/PM job now, and have never been a particularly strong programmer. I definitely wouldn't have thought I could make something with that polish, let alone in 3 months.
That code base is ~98% Claude code.
Not sure if it's an American pronunciation thing, but I had to stare at that long and hard to see the problem and even after seeing it couldn't think of how you could possibly spell the correct word otherwise.
I never paid any attention to different models, because they all felt roughly equal to me. But Opus 4.5 is really and truly different. It's not a qualitative difference, it's more like it just finally hit that quantitative edge that allows me to lean much more heavily on it for routine work.
I highly suggest trying it out, alongside a well-built coding agent like the one offered by Claude Code, Cursor, or OpenCode. I'm using it on a fairly complex monorepo and my impressions are much the same as Karpathy's.
If you're using plain vanilla chatgpt, you're woefully, woefully out of touch. Heck, even plain claude code is now outdated
Trying to incorporate it in existing codebases (esp when the end user is a support interaction or more away) is still folly, except for closely reviewed and/or non-business-logic modifications.
That said, it is quite impressive to set up a simple architecture, or just list the filenames, and tell some agents to go crazy to implement what you want the application to do. But once it crosses a certain complexity, I find you need to prompt closer and closer to the weeds to see real results. I imagine a non-technical prompter cannot proceed past a certain prototype fidelity threshold, let alone make meaningful contributions to a mature codebase via LLM without a human engineer to guide and review.
It's been especially helpful in explaining and understanding arcane bits of legacy code behavior my users ask about. I trigger Claude to examine the code and figure out how the feature works, then tell it to update the documentation accordingly.
After certain experience threshold of making things from scratch, “coding” (never particularly liked that term) has always been 99% building, or architecture, and I struggle to see how often a well-architected solution today, with modern high-level abstractions, requires so much code that you’d save significant time and effort by not having to just type, possibly with basic deterministic autocomplete, exactly what you mean (especially considering you would have to also spend time and effort reviewing whatever was typed for you if you used a non-deterministic autocomplete).
Asking it to do entire projects? Dumb. You end up with spaghetti, unless you hand-hold it to a point that you might as well be using my autocomplete method.
Quite insightful.
So I think this tracks with Karpathy's defense of IDEs still being necessary?
Has anyone found it practical to forgo IDEs almost entirely?
It's not about the model. It's about the harness
Also note that with Claude models, Copilot might allocate a different number of thinking tokens compared to Claude Code.
Things may have changed now compared to when I tried it out; these tools are in constant flux. In general I've found that harnesses created by the model providers (OpenAI/Codex CLI, Anthropic/Claude Code, Google/Gemini CLI) tend to be better than generalist harnesses.
But what I like about this setup is that I have almost all the context I need to review the work in a single PR. And I can go back and revisit the PR if I ever run into issues down the line. Plus you can run sessions in parallel if needed, although I don't do that too much.
The juniors, though, will have to radically upskill. The standard junior dev portfolio can be replicated by Claude Code in like three prompts
The game has changed and I don't think all the players are ready to handle it
Empowering people to do 10 times as much as they could before means they hit 100 times the roadblocks. Again, in a lot of ways we've already lived in that reality for the past many years. On a task-by-task basis programming today is already a lot easier than it was 20 years ago, and we just grew our desires and the amount of controls and process we apply. Problems arise faster than solutions. Growing our velocity means we're going to hit a lot more problems.
I'm not saying you're wrong, so much as saying, it's not the whole story and the only possibility. A lot of people today are kept out of programming just because they don't want to do that much on a computer all day, for instance. That isn't going to change. There's still going to be skills involved in being better than other people at getting the computers to do what you want.
Also on a long term basis we may find that while we can produce entry-level coders that are basically just proxies to the AI by the bucketful that it may become very difficult to advance in skills beyond that, and those who are already over the hurdle of having been forced to learn the hard way may end up with a very difficult to overcome moat around their skills, especially if the AIs plateau for any period of time. I am concerned that we are pulling up the ladder in a way the ladder has never been pulled up before.
I personally think the barrier is going to get higher, not lower. And we will be back expected to do more.
Anyone wondering what exactly he is actually building? What? Where?
> The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do.
I would LOVE to have just syntax errors produced by LLMs. "Subtle conceptual errors that a slightly sloppy, hasty junior dev might do" are neither subtle nor slightly sloppy; they actually are serious and harmful, and junior devs have no experience to fix those.
> They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?"
Why not just hand-write 100 LOC with the help of an LLM for tests, documentation and some autocomplete, instead of making it write 1000 LOC and then cleaning it up? Cleaning that up is also very difficult to do; 1000 lines is a lot.
> Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.
It's a computer program running in the cloud, what exactly did he expect?
> Speedups. It's not clear how to measure the "speedup" of LLM assistance.
See above
> 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's speedup, but it's possibly a lot more an expansion.
mmm not sure. If you don't have domain knowledge you could have an initial stab at the problem, but what about when you need to iterate on it? You can't if you don't have domain knowledge of your own
> Fun. I didn't anticipate that with agents programming feels more fun because a lot of the fill in the blanks drudgery is removed and what remains is the creative part.
No, it's not fun, e.g. LLMs produce uninteresting UIs, mostly bloated with React/HTML
> Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually.
My bet is that sooner or later he will get back to coding by hand for periods of time to avoid that, like many others; the damage that overreliance on these tools brings is serious.
> Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.
No, programming is not "syntactic details"; the practice of programming is everything but "syntactic details". One should learn how to program, not language X or Y
> What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows a lot.
Yet no measurable economic effects so far
> Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
Did people with a smartphone outperform photographers?
This is just a rambling tweet that has all the hallmarks of an AI addict.
Great idea! Let's pathologize another thing! I love quickly othering whole concepts and putting them in my brain's "bad" box so I can feel superior.
https://github.com/karpathy/llm.c
The proof is in the pudding. Let's see your code
It does hurt, that's why all programmers now need an entrepreneurial mindset... you become one if you use your skills + new AI power to build a business.
HN used to be a proper place for people actually curious about technology
Somewhere, there are GPUs/NPUs running hot. You send all the necessary data, including information that you would never otherwise share. And you most likely do not pay the actual costs. It might become cheaper or it might not, because reasoning is a sticking plaster on the accuracy problem. You and your business become dependent on this major gatekeeper. It may seem like a good trade-off today. However, the personal, professional, political and societal issues will become increasingly difficult to overlook.
Then even if you do catch it, AI: "ah, now I see exactly the problem. just insert a few more coins and I'll fix it for real this time, I promise!"
as the former, i've never felt _more ahead_ than now due to all of the latter succumbing to the llm hype
Like there have been multiple times now where I wanted the code to look a certain way, but it kept pulling back to the way it wanted to do things. Like if I had stated certain design goals recently it would adhere to them, but after a few iterations it would forget again and go back to its original approach, or mix the two, or whatever. Eventually it was easier just to quit fighting it and let it do things the way it wanted.
What I've seen is that after the initial dopamine rush of being able to do things that would have taken much longer manually, a few iterations of this kind of interaction have slowly led to disillusionment with the whole project, as AI keeps pushing it in a direction I didn't want.
I think this is especially true if you're trying to experiment with new approaches to things. LLMs are, by definition, biased by what was in their training data. You can shock them out of it momentarily, which is awesome for a few rounds, but over time the gravitational pull of what's already in their latent space becomes inescapable. (I picture it as working like a giant Sierpinski triangle).
I want to say the end result is very akin to doom scrolling. Doom tabbing? It's like, yeah I could be more creative with just a tad more effort, but the AI is already running and the bar to seeing what the AI will do next is so low, so....
Yea exactly. Like we are just waiting for it to get completed, and after it gets completed, then what? We ask it to do new things again.
Just as how if we are doom scrolling, we watch something for a minute then scroll down and watch something new again.
The whole notion of progress feels completely fake with this. Somehow I guess I was in a bubble of time where I had always ended up using AI in web browsers (just as when ChatGPT 3 came out) and my workflow didn't change because it was free, but recently it changed when some new free services dropped.
"Doom-tabbing" or complete out of the loop AI agentic programming just feels really weird to me sucking the joy & I wouldn't even consider myself a guy particular interested in writing code as I had been using AI to write code for a long time.
I think the problem for me was that I always considered myself a computer tinkerer before a coder. So when AI came for coding, my tinkering skills were given a boost (I could make projects of curiosity I couldn't earlier), but now with AI agents in this autonomous-esque way, it has come for my tinkering & I do feel replaced, or just feel like my ability to tinker, my interests, my knowledge and my experience are just not taken into account if an AI agent will write the whole code in a multi-file structure, run commands and then deploy it straight to a website.
I mean, my point is tinkering was an active hobby, and now it's becoming a passive hobby. Doom-tinkering? I feel like I caught on to the feeling a bit earlier just from vibes, but is it just me who feels this, or?
What could be a name for what I feel?
I'm still a little iffy on the agent swarm idea. I think I will need to see it in action in an interface that works for me. To me it feels like we are anthropomorphizing agents too much, and that results in this idea that we can put agents into roles and then combine them into useful teams. I can't help seeing all agents as the same automatons, and I have trouble understanding why giving an agent different guidelines to follow, and then having it follow along with another agent, would give me better results than just fixing the context in the first place. Either that or just working more on the code pipeline to spot issues early on: all the stuff we already test for.
Starcraft and Factorio are exactly what it is not. Starcraft has a loooot of micro involved at any level beyond medium, despite all the "pro macros and beats gold league with mass queens" meme videos. I guess it could be like Factorio if you're playing it by plugging together blueprint books from other people but I don't think that's how most people play.
At that level of abstraction, it's more like grand strategy if you're to compare it to any video game? You're controlling high level pushes and then the units "do stuff" and then you react to the results.
> "IDEs/agent swarms/fallability. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might do. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE . md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow, my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits."