This still happens quite a bit, and it's just like taking away a hard task from someone less experienced. The difference is there is no point in investing your time teaching or explaining anything to the AI. It can't learn in that way, and it's not a person.
I like to vibe code single self-contained pages in HTML, CSS, and JavaScript, because there's a very slim chance that something in the browser is going to break my computer.
This is a problem I have seen a lot, from professionals to beginners: unless you actually carved the rock yourself, you don't really have insight into every detail of what you see. This is incidentally why programmers tend to prefer rewriting stuff to fixing it. We're lazy.
Hmm. And maintenance?
But this is, properly expressed, "see if it works, based on my incomplete understanding of the code, an understanding that I haven't worked through and corrected by trying to write it myself, but have nevertheless communicated to an AI that may not have correctly understood it, and that therefore may not have even the vaguest inkling of the edge cases I haven't thought to mention yet".
Vibe-coded output could only properly be said to "work" if we were communicating our intentions using formal methods.
What you mean is "apparently works".
You notice how LLMs do well on small tasks and small projects? The more LLM code you add to your projects, the slower and worse (read: more tokens) they perform. If this were by design (create bigger, unmaintainable projects so you can slowly squeeze more and more tokens out of your users), I'd have applauded the LLM creators, but I think it's by accident. Still funny though.
Except someone less experienced never gets to try, since all the experienced programmers are now busy shepherding AI agents around; they aren't available to mentor the next generation.
There will be bugs that the AI cannot fix, especially in the short term, which will mean that code needs to be readable and understandable by a human. Without human review that will likely not be the case.
I'm also intrigued by "see if it works". How is this being evaluated? Are you writing a test suite, or manually testing?
Don't get me wrong, this approach will likely work in lower risk software, but I think you'd be brave to go no human review in any non-trivial domain.
Reminds me of Steve Yegge's short-lived CHOP - Chat Oriented Programming: https://sourcegraph.com/blog/chat-oriented-programming-in-ac...
I remain a Karpathy originalist: I still define vibe coding as where you don't care about the code being produced at all. The moment you start reviewing the code you're not vibe coding any more, by the definition I like.
My wife used the $20 claude.ai and Claude Code (the latter at my prompting) to vibe-code an educational game to help our five-year-old with phonics and basic math.
She noticed that she was constantly hitting token limits and that tweaking or adding new functionality was difficult. She realized that everything was in index.html, and she scrolled through it, and it was clear to her that there was a bunch of duplicated functionality.
So she embarked on a quest to refactor the application, move stuff from code to config, standardize where the code looks for assets, etc. She did all this successfully - she's not hitting token limits anymore and adding new features seems to go more smoothly - without ever knowing a lick of JS.
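Roughly the kind of change that refactor involves, as a hypothetical sketch (the helper functions and asset names here are invented, not her actual code):

    // Before: each level hard-codes its own sounds and cards.
    // playSound/showCard are stand-ins for the game's real helpers.
    const playSound = (src) => console.log("play", src);
    const showCard = (img, word) => console.log("show", img, word);

    function loadLevelOne() {
      playSound("sounds/level1/intro.mp3");
      showCard("images/level1/cat.png", "cat");
    }

    // After: one generic loader plus a config object, so adding a level
    // means editing data rather than duplicating code.
    const levels = {
      1: { intro: "intro.mp3", cards: [{ img: "cat.png", word: "cat" }] },
      2: { intro: "intro.mp3", cards: [{ img: "dog.png", word: "dog" }] },
    };

    function loadLevel(n) {
      const level = levels[n];
      playSound(`sounds/level${n}/${level.intro}`);
      for (const card of level.cards) {
        showCard(`images/level${n}/${card.img}`, card.word);
      }
    }

    loadLevel(1); // same behaviour as loadLevelOne(), driven by config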
She's a UX consultant so she has lots of coding-adjacent skills, and talks to developers enough that she understands the code / content division well.
What would you call what she's doing (she still calls it vibe coding)?
But I think the fact that she's managing without even knowing (or caring to know) the language the code base is written in means that it isn't really "coding" either.
She's doing the application architecture herself without really needing to know how to program.
I've copy-pasted snippets for tools and languages that I do not know. Refactored a few parameters, got things working. I think that counts as programming in a loose sense. Maybe not software development or engineering, but programming.
The first non-toy program I ever wrote was in BASIC, on ZX Spectrum 48, and although I don't have it anymore, I remember it was one of the shittiest, most atrocious examples of spaghetti code that I've ever seen in my life.
Everyone starts somewhere.
I was picking apart other people's JavaScript to see how it worked years before I was taught anything about coding in a formal setting.
Also I feel old now, I think we _just_ did XP at a company, but that was almost a quarter century ago :D
This didn't really work - it would often implement a factored-out piece of logic but just leave the pre-existing code alone. So she had to be pretty specific about problem areas in order to actually push the refactoring plan forward.
I think orchestration may be a step that can't really be magicked away by advancements, beyond toy implementations. Because at the end of the day it is just adding specific details to the idea. Sure, you can YOLO an idea and the LLMs can get better at magically cleaning things up, but the deeper you go the larger the drift will be from the concept and the reality without continued guidance.
If LLMs could construct buildings, you might describe what you want room by room, and the underlying structure would need to be heavily revamped at each addition. Unless you start with "I am making a 20-floor building", you are going to waste a lot of time having the LLM re-architect.
I think the real new skill people are going to get scary good at is rapid architecting without any strong awareness of how things work underneath the hood. This is the new "web programming isn't real programming" moment where future developers might not ever look at (or bother learning about!) variables or functions.
I do think we need a new definition for vibe-coding, because the way the term is used today shouldn’t necessarily include “not even reading the code”.
I’m aware that Karpathy’s original post included that idea, but I think we now have two options:

- Let the term vibe-coding evolve to cover both those who read the code and those who don’t.
- Or define a new term — something that also reflects production-grade coding where you actually read the code.

If that’s not vibe-coding, then what is it? (To me, it still feels different from traditional coding.)
I have a few problems with evolving "vibe coding" to mean "any use of LLMs to help write code":
1. Increasingly, that's just coding. In a year or so I'll be surprised if there is still a large portion of developers who don't have any LLM involvement in their work - that would be like developers today who refuse to use Google or find useful snippets on Stack Overflow.
2. "Vibe coding" already carries somewhat negative connotations. I don't want those negative vibes to be associated with perfectly responsible uses of LLMs to help write code.
3. We really need a term that means "using prompting to write unreviewed code" or "people who don't know how to code who are using LLMs to produce code". We have those terms today - "vibe coding" and "vibe coders"! It's useful to be able to say "I just vibe-coded this prototype" and mean "I got it working but didn't look at the code" - or "they vibe-coded it" as a warning that a product might not be reviewed and secure.
Just like no one speaks of vibe-aeronautics-engineering when they’re “just” using CAD.
More specifically, GAIA in SDE produces code systematically, with a human in the loop to ensure correctness, e.g. the systematic way tptacek has been describing recently [2].
[1] https://en.m.wikipedia.org/wiki/Gaia
[2] https://news.ycombinator.com/item?id=44163063
Briefly summarized here I guess: https://news.ycombinator.com/item?id=44296550
Blind-coding.
So, clearly, almost nobody does that anymore. So according to Karpathy's definition, we have all been vibe coding for quite some time now. (An aside - if AIs were any good, they would just skip human languages entirely and go straight to binary.)
So I think the "vibe" in vibe coding refers to inputting a fuzzy/uncertain/unclear/incomplete specification to a computer, where the computer will fill in details using an algorithm which in itself is incomprehensible for humans (so they can only "feel the vibes").
Personally, I don't find the fuzziness of the specification to be the problem; on some level it might be desirable, having a programming tool like that. But the unpredictability of the output is IMHO a real issue.
Because the compiler is deterministic and the cost of getting something better (based on the processor's capabilities) is higher than just going with the compiled version, which has like a 99.9% chance of being correct (compiler bugs are rare). It's not vibecoding. It's knowing your axioms are correct, when viewing the programming language as a proof system (which it is). You go from some low-level semantics, upon which you build higher-level semantics, which form your business rules.
So giving the LLM fuzzy specs is hoping that the stars will align and your ancestors' spirits will awaken to hear your prayers for a sensible output.
And I agree with the logic part too. I think we could have humans input fuzzy specifications in some formal logic that allows for different interpretations, like fuzzy logic or various modal logics or a combination of them. But then you have a defined and understandable set of rules for how to resolve what would be contradictions in classical logic.
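For instance, here's a minimal sketch (in JavaScript, with invented requirement names and weights) of how the classic Zadeh operators give a fixed, inspectable resolution rule rather than an opaque one:

    // Degrees of truth for two partially-satisfiable requirements.
    // The names and numbers are made up for illustration.
    const truth = { fastStartup: 0.8, richAnimations: 0.4 };

    // Standard fuzzy-logic connectives: a defined, reproducible rule set.
    const AND = (a, b) => Math.min(a, b);
    const OR  = (a, b) => Math.max(a, b);
    const NOT = (a)    => 1 - a;

    // In classical logic, demanding both fully would be a contradiction;
    // here it simply resolves to a degree of satisfaction.
    console.log(AND(truth.fastStartup, truth.richAnimations));     // 0.4
    console.log(OR(truth.fastStartup, NOT(truth.richAnimations))); // 0.8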
The problem with LLMs is that they are using a completely unknown logic system, which can shift wildly from version to version, and is not even guaranteed to be consistent. They're the opposite of where software engineering, as an engineering discipline, should be going - to formalize the production process more, so it can be more rigorously studied and easier to reproduce.
I think what SW engineering needs is more metaprogramming. (What we typically call metaprogramming - macros - is just the tip of the iceberg.) What I mean is making more programs that study, modify and transform the resulting programs in a certain way. Most of our commonly used tools are woefully incapable of metaprogramming. But LLMs are decent at it, and that's why they're so interesting.
For example, we don't publish version modifications to language runtimes as programs. We could, for example, produce a program that automatically transforms your code to a new version of the programming language. But we don't do that, mostly because we have only just started to formalize mathematics, and it will take some time until we completely logically formalize all the legacy computer systems we have. Then we will be able to prove, for instance, that a certain program can transform a program from one runtime to another at a certain maximal incurred extra execution cost.
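As a minimal sketch of that idea, a migration could ship as a small transform program; this one uses Babel (a real JS library), and the oldApi/newApi names are invented:

    // Sketch: a language/API migration published as a program, not a changelog.
    // Requires @babel/core to be installed; oldApi/newApi are hypothetical names.
    const { transformSync } = require("@babel/core");

    // A Babel plugin that renames every `oldApi` identifier to `newApi`.
    function migrateOldApi() {
      return {
        visitor: {
          Identifier(path) {
            if (path.node.name === "oldApi") path.node.name = "newApi";
          },
        },
      };
    }

    const input = "oldApi(1, 2);";
    const { code } = transformSync(input, { plugins: [migrateOldApi] });
    console.log(code); // -> newApi(1, 2);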
It's not unknown. And it's not a logic system. Roughly, it takes your prompt, adds it to the system prompt, and runs it through a generative program that pattern-matches it to an output. It's like saying your MP3 player (+ your MP3 files) is a logic system. It's just data and its translator. And having enough storage to hold all the sounds in the world just means you have all the sounds in the world, not that it's automatically a composer.
And consistency is the basic condition for formalism. You don't change your axioms, nor your rules, so that everyone can understand that what you said is what you intended to say.
> What I mean is making more programs that study, modify and transform the resulting programs in a certain way.
That certain way is usually fully defined and spec-ed out (again, formalism). It's not about programming roulette, even if the choices are mostly common patterns. Even casinos don't want their software to be unpredictable.
> Most of our commonly used tools are woefully incapable of metaprogramming.
Because no one wants it. Lisp has been there for ages and only macros have seen extensive use, mostly as a way to cut down on typing. Almost no one has the need to alter the basic foundations of the language to implement a new system (CLOS is kind of the exception there). It's a lot of work to be consistent, and if the existing system is good enough, you just go with it.
> we don't publish version modifications to language runtimes as programs
Because patching binaries is hazardous, and loading programs at runtime (plugins) is nerfed on purpose. Not because we can't. It's a very big can of worms (we've just seen, with the CrowdStrike incident, what happens when you're not careful about it).
This is very dumb. Of course you can.
When it's your problem being delegated, you can't delegate consequences away. I can eat for you, but you won't get satiated this way.
You cannot delegate the act of thinking because the process of delegation is itself a decision you have made in your own mind. The thoughts of your delegates are implicitly yours.
Just like if you include a library in your code, you are implicitly hiring the developers of that library onto your project. Your decision to use the library is hiring the people who wrote it, to implicitly write the code it replaces. (This is something I wish more people understood)
That's not to say that these models don't provide value, especially when writing code that is straightforward but can't be easily generalized/abstracted (e.g., some test-case writing, lots of boilerplate idioms in Go, and basic CRUD).
In terms of labor, I potentially see this increasing the value (and therefore cost) of actual experienced developers who can approach novel and challenging problems, because their productivity can be dramatically amplified through proper use of AI tooling. At the other end of the spectrum, someone who just writes CRUD all day is going to be less and less valuable.
That said, if you spend most of your time sussing out function signatures and micromanaging every little code decision the LLM makes, then that's time wasted imo and something that will become unacceptable before long.
Builders will rejoice, artisan programmers maybe not so much.
Maintainers definitely not so much.
A requirement to do so might lead to more. Like loss of a job for the illiterate "programmer".
So you measure “productivity” in lines of code? Say no more.
I'm still not sold on that. Stack Overflow's UI has a lot of signals for a good response: the number of answers, the upvotes on those answers, the comments... With a quick scan, you can quickly tell whether you need to go back to the web search page (I've never used SO search) or do a slower read.
So no, you don’t _need_ to read code anymore. But not reading code is a risk.
That risk is proportional to characteristics that are very difficult, and in many cases impossible, to measure.
So currently best practice would be to continue reading code. Sigh.
https://www.theguardian.com/uk-news/2025/jul/08/post-office-...
This is the logical conclusion of the indiscipline of undereducated developers who have copied and pasted throughout their career.
This reality is then expressed as "but humans also copy and paste", as if that makes it OK to just hand the task over to an AI that might do it better, when the real solution is to train people not to copy and paste.
Everything about AI is the same story, over and over again: just pump it out. Consequences are what lawyers are for.
It's really interesting to me that within basically a generation we've gone from people sneering at developers with old fashioned, niche development skills and methodologies (fortran, cobol, Ada) to sneering at people with the old-fashioned mindset that knowing what your code is doing is a fundamental facet of the job.
That’s already engineering. Your sentiment is cute but I think you have some romantic vision of what “real engineering” is.
Software engineering goes a lot deeper than that; look at any serious accredited syllabus.
But almost nobody practises it these days, you are right. The web kind of blurred the line between software and document for a while and a lot of stuff got lost.
That's not a reason to practise even less.
Keep in mind I have an engineering degree and 15 years of experience, so you are not writing back to a 20-year-old hack who just learned PHP.
I guess it's really hard to work with AI agents, if you don't have real project experience in a more senior position.
(To be fair, my kind of testing is a lot different than unit tests, and the tests I'm cancelling are multi-page forms that require three signatures.)
* Treat the AI/ML as a junior programmer, not a senior - albeit one willing to make a leap on basically any subject. Nevertheless, a junior is someone whose code must always be questioned, reviewed, and understood before execution. Senior code is only admissible from a human being. However, human beings may have as many junior AIs in their armpit as they want, as long as those humans do not break this rule.
* Have good best practices in the first f’in place!!
Vibe-coding is crap because ‘agile hacking’ is crap. Put your code through a proper software process, with a real workflow - i.e. don’t just build it and ship it. Like, ever. Even if you’ve written every line of code yourself - but especially if you haven’t - never ship code you haven’t tested, reviewed, proven, demonstrated in a production analog environment, and certified before release. Yes, I mean it, your broken FNORD hacking habits will be force-magnified immediately by any AI/ML system you puke them into. Waterfall or gtfo, vibe-code’rs…
* Embrace Reading Code. Look, if you’re gonna churn milk into butter, know what both milk and butter taste like, at least for your sake. Don’t ship sour butter unless you’re making cheese, and even then, taste your own cheese. AI/ML is there to make you a more competent human being; if you’re doing it to avoid actually doing any work, you’re doing it wrong. Do it to make work worth doing again….
Passive code reviews ("read by at least two humans") are fraught with error. IMHO, a better mantra is:
    Never allow a merge to the main branch that does not have some amount of
    documented test coverage, be it unit, functional, and/or integration specs.

Ok, but in this case you can just throw away the tree and a new one will grow immediately for you to review anew. Rinse and repeat.
I'm not saying the author's proposed approach is never the right one, but there is a meaningfully different approach between the two suggested. You can look at the result of a fully autonomous agent, note the issues, tweak your prompt + inputs and then start over. You get the benefits of more closely-steered output without the drag of having to constantly monitor it synchronously. For some things, this approach is token-wasteful, but for small (yet critical / high-value) features, I have found it to work quite well. And an ancillary benefit is that your starting prompts and inputs improve over time.
And what about when you need a forest? Seeds from trees grow new trees. A whole forest cannot be inspected and discarded over and over.
> but for small (yet critical / high-value) features
These are the features that need the human attention. These are the features that are the most fun for a human to make. These are the features that make the human improve the most. So they're the last ones I would leave to the AI.
Really? You mean that person that drains senior developer time, but in which we still invest time because we're confident they're going to turn out great after the onboarding?
Until then, they're a net time sink, aren't they? Unless the projects you've got to deliver are fantastically boring.
So is this really what it's all about? Perpetual onboarding?
It is almost as if understanding the problem to be solved is the hard part.