vs. what this author is doing, which seems more like agent assisted coding than "vibe" coding.
With regard to the subject matter, it of course makes sense that managing more features than you used to be able to manage without $AI_MODEL would result in some mental fatigue. I also believe this gets worse the older you get. I've seen this within my own career, just from times of being understaffed and overworked, AI or not.
Agreed. I've seen some folks say that it requires absolute ignorance of the code being generated to be considered "vibe coded". Though I don't agree with that.
For me it's more nuanced. I consider code "vibed" in proportion to how little you reviewed it. Considering LLMs can do some crazy things, even a few ignored LOC might leave a feature feeling pretty "vibe coded", despite being mostly reviewed outside of those ignored lines.
Or here: https://en.wikipedia.org/wiki/Vibe_coding
Not looking at the code at all by default is essential to the term.
I.e. you could say you vibe'd 95% of the PR, and I'd agree with that - but are you vibe coding then? You looked at 5% of the code, so you're not ignoring all of the code.
Yet in the spirit of the phrase, it seems silly to say someone is not vibe coding despite ignoring almost all of the code generated.
Instead of temporarily suspended.
Whatever happened to the word "suspended" for temporary and "ban" for permanent? Now places say "permanent" with an expiration date.
Nobody alive has ever been electrocuted, but you will meet people who claim to have been.
Then lots of people were introduced to the term "vibe coding" in these conversations, and so naturally took it as a synonym for using LLMs for coding assistance even when reading the code and writing tests and such.
Also because vibe coding just sounds cool.
In other words, everyone's in on the joke.
No. The etymology of "hacker" in the technical scene started at MIT's Tech Model Railroad Club in the late 1950s/early 1960s, where "hack" described clever, intricate solutions, pranks, or experiments with technology.
A hacker is one who made those clever solutions, pranks, and technology experiments. "Hacker News" is trying to take it back from criminal activity.
Disagree. Vibe coding is even more powerful if you know what you're doing. Because if you know what you're doing, and you keep up with the trends, you also know when to use it, and when not to. When to look at the code or when to just "vibe" test it and move on.
> When to look at the code or when to just "vibe" test it and move on.
I'm really curious how you're ensuring the code output by whatever LLM you're using is actually doing what you think it's doing.
Here's a recent example where I used this pattern: I was working on a (micro) service that implements a chat based assistant. I designed it a bit differently than the traditional "chat bot" that's prevalent right now. I used a "chat room" approach, where everyone (user, search, LLM, etc) writes in a queue, and different processes trigger on different message types. After I finished, I had tested it with both unit tests and scripted integration tests, with some "happy path" scenarios.
But I also wanted to see it work "live" in a browser. So, instead of waiting for the frontend team to implement it, I started a new session, and used a prompt along the lines of "Based on this repo, create a one page frontend that uses all the relevant endpoints and interfaces". The "agent" read through all the relevant files, and produced (0 shot) an interface where everything was wired correctly, and I could test it, and watch the logs in real-time on my machine. I never looked at the code, because the artifact was not important for me; the important thing was the fact that I had it, 5 minutes later.
Fun fact, it did allow me to find a timing bug. I had implemented message merging, so the LLM gets several messages at once, when a user types\n like\n this\n and basically adds new messages while the others are processing. But I had a weird timing bug, where a message would be marked as "processing", a user would type a message, and the compacting algo would all act "at the same time", and some messages would be "lost" (unprocessed by the correct entity). I didn't see that from the integration tests, because sometimes just playing around with it reveals such weird interactions. For me being able to play around with the service in ~5 minutes was worth it, and I couldn't care less about the artifact of the frontend. A dedicated team will handle that, eventually.
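To make the "chat room" approach concrete, a minimal Python sketch (made-up message kinds and handlers, not the actual service) of everyone writing into one queue while different handlers trigger on different message types:

    import queue
    import threading
    from dataclasses import dataclass

    @dataclass
    class Message:
        kind: str   # e.g. "user", "search_result", "llm_reply"
        text: str

    # Everyone (user, search, LLM, ...) writes into the same shared queue.
    room = queue.Queue()

    # Each worker only reacts to the message kinds it cares about.
    HANDLERS = {
        "user": lambda m: print(f"search + LLM triggered by: {m.text}"),
        "search_result": lambda m: print(f"LLM consumes search result: {m.text}"),
        "llm_reply": lambda m: print(f"deliver to user: {m.text}"),
    }

    def dispatcher():
        while True:
            msg = room.get()
            handler = HANDLERS.get(msg.kind)
            if handler:
                handler(msg)
            room.task_done()

    threading.Thread(target=dispatcher, daemon=True).start()

    room.put(Message("user", "hello"))
    room.put(Message("search_result", "top documents ..."))
    room.put(Message("llm_reply", "here is my answer"))
    room.join()

The timing bug above is the kind of race (handlers and the compaction step acting on the same pending messages) that only shows up once several writers are hitting that queue at once.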
A negative but courteous remark is "slamming", a tweet is an "attack", etc.
So yeah I'm not surprised that people conflate any use of AI with vibe-coding.
Sounds more like de-volution to me.
Not to mention all the attempts we see nowadays at deliberate redefinition of words, or the motte-and-bailey games played with jargon vs. lay understandings of a concept.
I do not think that is a new thing in human history, either. But sure, the internet amplified it a lot.
My interpretation is that you can look at the code but vibe coding means ultimately you're not writing the code, you're just prompting. It would make sense to prompt "I'd like variable name 'bar' to be 'foo' instead." and that would still be vibe coding.
Personally I think "vibe-coding" has semantically shifted to mean any AI-assisted coding and we should just run with it. For the original meaning of vibe-coding, I suggest YOLO-Coding.
I've found that if an LLM writes too much code, even if I specified what it should be doing, I still have to do a lot of validation myself that would have been done while writing the code by hand. This turns the process from "generative" (haha) to "processing", which I struggle a lot more with.
Unfortunately, the reason I have to do so much processing on vibe code or large generated chunks of code is simply because it doesn't work. There is almost always an issue that is either immediately obvious, like the code not working, or becomes obvious later, like poorly structured code that the LLM then jams into future code generation, creating a house of cards that easily falls apart.
Many people will tell me that I'm not using the right model or tools or whatever, but it's clear to me that the problem is that AI doesn't have any vision of where your code will need to organically head. It's great for one shots and rewrites, but it always always always chokes eventually on larger/complicated projects, ESPECIALLY ones that are not written in common languages (like JavaScript) or common packages/patterns, and then I have to go spelunking to find why things aren't working or why it can't generate code to do something I know is possible. It's almost always because the input for new code is my ask AND the poorly structured code, so the LLM will rarely clean up its own crap as it goes. If anything, it keeps writing shoddy wrappers around shoddy wrappers.
Anyways, still helpful for writing boilerplate and segments of code, but I like to know what is happening and have control over how my code is structured. I can't trust the LLMs right now.
I could see other elements of isolation being useful, but this kind of feels like a lot of extra work and complexity which is part of the issue...
I still think it saves me time on net and yes, it typically can handle a lot on its own, but whenever it starts to fuck up the same request repeatedly in different ways, all I can really do is sigh/roll my eyes and then it's on me alone to dig in and figure it out/fix it to keep making progress.
And usually that consists of incredibly ungratifying, unpleasant work I'm very much not happy to be doing.
I definitely have been able to do more side projects for ideas that pop into my head thanks to CC and similar, and that part is super cool! But other times I hit a wall where a project suddenly goes from breezy and fun to me spending hours reading through diffs/chat history trying to untangle a pile of garbage code I barely understand 10% of and have to remind myself I was supposed to be doing this for "fun"/learning, and accomplishing neither while not getting paid for it.
Coolest bit of research I came across was what the brain does during sleep. It basically reduces connections, but it also makes you hallucinate (dream). This was found in research on fish, and also in training LLMs: there's great value in "forgetting" for generalization.
After studying it in LLMs for a while, I also came to your same conclusion about my own brain. Problems are often so complex that you must let your brain forget in order to handle the complexity. In the same sense, I also believe this is the path to AGI.
I recently used a coding agent on a project where I was using an unfamiliar language, framework, API, and protocol. It was a non-trivial project, and I had to be paying attention to what the agent was doing because it definitely would go off into the weeds fairly often. But not having to spend hours here and there getting up to speed on some mundane but unfamiliar aspect of the implementation really made everything about the experience better.
I even explored some aspects of LLM performance: I could tell that new and fast changing APIs easily flummox a coding agent, confirming the strong relationship of up-to-date and accurate training material to LLM performance. I've also seen this aspect of agent assisted coding improve and vary across AIs.
[1] https://skepchick.org/2020/10/the-dunning-kruger-effect-misu...
At some point you realize if you want people to trust you you have to do this. Otherwise you’re just gambling, which isn’t very trustworthy.
It’s also got the cumulative effect of making you a good developer if done consistently over the course of your career. But yes, it’s annoying and slow in the short term.
Red flag. In other words you don’t understand the implementation well enough to know if the AI has done a good job. So the work you have committed may work or it may have subtle artefacts/bugs that you’re not aware of, because doing the job properly isn’t of interest to you.
This is ‘phoning it in’, not professional software engineering.
At least when the AI does it you can review it.
Which is why you spend time upfront becoming familiar with whatever it is you need to implement. Otherwise it’s just programming by coincidence [1], which is how amateurs write code.
> and are probably going to even insert more footguns than the AI.
Very unlikely. If I spend time understanding a domain then I tend to make fewer errors when working within that domain.
> At least when the AI does it you can review it.
You can’t review something you don’t understand.
[1] https://dev.to/decoeur_/programming-by-coincidence-dont-do-i...
Red flag again! If your protection is to "understand the implementation", it means buggy code. What makes code worthy of trust is passing tests, well-designed tests that cover the angles. LGTM is vibe testing.
I go as far as saying it does not matter whether the code was written by a human who understands it or not; what matters is how well it is tested. Vibe testing is the problem, not vibe coding.
(Sorry, but you set yourself up for this one, my apologies.)
Oh, so this post describes "worthy code", okay then.
https://news.ycombinator.com/item?id=18442941
Tests are not a panacea. They don't care about anything other than what you test. If you don't have code testing maintainability and readability, only that it "works", you end up like the product in that post.
Ultimate example: Biology (and everything related, like physiology, anatomy), where the test is similarly limited to "does it produce children that can survive". It is a huuuuuge mess, and trying to change any one thing always messes up things elsewhere in unexpected and hard or impossible to solve ways. It's genius, it works, it sells - and trying to deliberately change anything is a huge PITA because everything is interconnected and there is no clean design anywhere. You manage to change some single gene to change some very minor behavior, suddenly the ear shape changes and fur color and eye sight and digestion and disease resistance, stuff like that.
In my 40 years of writing code, I’ve worked on many different code bases and in many different organisations. And I never changed a line of code, deleted code, or added more code unless I could run it in my head and ‘know’ (to the extent that it’s possible) what it will do and how it will interact with the rest of the project. That’s the job.
I’m not against using AI. I use it myself, but if you don’t understand the scope fully, then you can’t possibly validate what the AI is spitting out, you can only hope that it has not fucked up.
Even using AI to write tests will fall short if you can’t tell if the tests are good enough.
For now we still need to be experts. The day we don’t need experts the LLMs should start writing in machine code, not human readable languages
> I do not need to understand the full stack.
Nobody said that. It’s important to understand the scope of the change. Knowing more may well improve decision making, but pragmatism is of course important.
Not understanding the thing you’re changing isn’t pragmatism.
I'm not a professional SWE, I just know enough to understand what the right processes look like, and vibe coding is awesome but chaotic and messy.
There is a big difference between vibe coding and llm assisted coding and the poster above seems to be aware of it.
It was already obvious from your first paragraph - in that context even the sentence "everything works like I think it should" makes absolute sense, because it fits perfectly with the limited understanding of a non-engineer - from your POV, it indeed all works perfectly, API secrets in the frontend and 5 levels of JSON transformation on the backend side be damned, right ;) Yay, vibe-coding for everyone - even if it takes longer than programming the conventional way, who cares, right?
Longer than writing code from scratch, with no templates or frameworks? Longer than testing and deploying manually?
Even eight years ago when I left full-stack development, nobody was building anything from scratch, without any templates.
Serious questions - are there still people who work at large companies who still build things the conventional way? Or even startups? I was berated a decade ago for building just a static site from scratch so curious to know if people are still out there doing this.
Hopefully you aren't discouraged by this, observationist, pretty clear hansmayer is just taking potshots. Your first paragraph could very well have been written by a professional SWE who understood what level of robustness was required given the constraints of the specific scenario in which the software was being developed.
Which to me, as a professional SWE, seems like a very engineer thing to think about, if I've read both of your comments correctly.
AI removes boredom AND removes the natural pauses where understanding used to form.
Energy goes up, but so does the "compression" of cognitive work.
I think it's less a question of "faster" or "slower" than of who controls the tempo.
Compression is exactly what is missing for me when using agents. Reading their approach doesn't let me compress the model in my head to evaluate it, and that was why I did programming in the first place.
I agree it can be energizing because you can offload the bullshit work to a robot. For example, build me a CRUD app with a bootstrap frontend. Highly useful stuff especially if this isn't your professional forte.
The problems come afterwards:
1. The bigger the generated base codebase, the less likely you are to find the time or energy to refactor LLM slop into something maintainable. I've spent a lot of time tailoring prompts for this type of generation and still can't get the code to be as precise as something an engineer would write.
2. Using an unfamiliar language means you're relying entirely on the LLM to determine what is safe. Suppose you wish to generate a project in C++. An LLM will happily do it. But will it be up to a standard that is maintainable and safe? Probably not. The devil is in the mundane details you don't understand.
In the case of (2) it's likely more instructive to have the LLM make you do the leg work, and then it can suggest simple verifiable changes. In the case of (1) I think it's just an extension of the complexity of any project, professional or not. It's often better to write it correctly the first time than to write it fast and loose and then try to find the time to fix it later.
You, too, can be awarded the Order of Labor Glory, Third Class.[1]
You didn’t find that to be a little too much unfamiliarity? With the couple of projects that I’ve worked on that were developed using an “agent first” approach I found that if I added too many new things at once it would put me in a difficult space where I didn’t feel confident enough to evaluate what the agent was doing, and when it seemed to go off the rails I would have to do a bunch of research to figure out how to steer it.
Now, none of that was bad, because I learned a lot, and I think it is a great way to familiarize oneself with a new stack, but if I want to move really fast, I still pick mostly familiar stuff.
I was ready to find that it was a bit much. The conjunction of ATProto and Dart was almost too much for the coding agent to handle and stay useful. But in the end it was OK.
I went from "wow that flutter code looks weird" to enjoying it pretty quickly.
So don't. It is vibe coding, not math class. As long as it looks like it works then all good.
But we're seeing that this becomes OK in the workplace, and I don't believe it is.
If you propose these changes that would've normally taken you 2 weeks as your own in a PR, then I, as the reviewer, don't know where your knowledge ends and the AI's hallucinations begin.
Do you need to do all of these things? Or is it because the most commonly forked template of this piece of code has this in its boilerplate? I don't know. Do you?
How can you make sure the code works in all situations if you aren't even familiar with the language, let alone the framework / API and protocol?
* Do you know that in Java you have to use .equals() instead of == for string equality?
* Do you know that in Python a mutable default argument persists across calls to the function?
* And that in JavaScript it does not?
* Do you know that the C# && does not translate to VB.NET's And?

I feel like it would just yield a well-formatted, type safe, incorrect solution, which is no better than a tangled mess.
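To make the Python item from that list concrete, a minimal sketch:

    def append_item(item, bucket=[]):   # the default list is created once, at definition time
        bucket.append(item)
        return bucket

    print(append_item("a"))  # ['a']
    print(append_item("b"))  # ['a', 'b'] - the default "remembered" the earlier call

The equivalent JavaScript default parameter is re-evaluated on every call, so it would not accumulate.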
await page.waitForLoadState('networkidle');
But this isn't good, and is not encouraged. So much so that there's an eslint rule that suggests removing it. This means that by running the linter, if Claude does decide to inject this, it gets taken out, because the linter runs and then tells it so.

But I 100% agree. It's liberating to focus on the design of my project, and my mental model can be of how I want things to work.
It feels like that switch to test driven development where you start from the expected result and worry about the details later.
To your point, you can blow through damn-near anything pretty quickly now. Now I actually find myself problem-solving for nearly 8 hours every day. My brain feels fried at the end of the day way more than it used to.
I used to be like: "well, this thing will take me at least half a day, it's already 16:00, so I'd better do something quiet to cool down until the end of the day and tackle this issue tomorrow". I'll leave the office in a regular mood and take the night to get ready for tomorrow.
Now I'm like: "17:30? 30 minutes? I have time to tackle another issue today!" I'll leave the office exhausted and take the night to try and recover from the day I had.
It's often that just getting started at all on a task is the hardest part. That's why writers often produce a "vomit draft" (https://thewritepractice.com/vomit-first-draft/) just to get into the right frame of mind to do real writing.
Using a coding agent to fix something trivial serves the same purpose.
I take breaks.
But I also get drawn to overworking ( as I'm doing right now ), which I justify because "I'm just keeping an eye on the agent".
It's hard work.
It's hard to explain what's hard about it.
Watching as a machine does in an hour what would take me a week.
But also watching to stop the machine spin around doing nothing for ages because it's got itself in a mess.
Watching for when it gets lazy, and starts writing injectable SQL.
Watching for when it gets lazy, and tries to pull in packages it had no right to.
We've built a motor that can generate 1,000 horse power.
But one man could steer a horse.
The motor right now doesn't have the appropriate steering apparatus.
I feel like I'm chasing it around trying to keep it pointed forward.
It's still astronomically productive.
To abandon it would be a waste.
But it's so tiring.
Make it stop. Tell it to review whether the code is cohesive. Tell it to review it for security issues. Tell it to review it for common problems you've seen in just your codebase.
Tell it to write a todo list for everything it finds, and tell it fix it.
And only review the code once it's worked through a checklist of its own reviews.
We wouldn't waste time reviewing a first draft from another developer if they hadn't bothered looking over it and testing it properly, so why would we do that for an AI agent that is far cheaper?
The thinking should probably include this kind of introspection (give me a million dollars for training and I'll write a paper) but if it doesn't you can just prompt it to.
Like when I'm asking it to run a bunch of tests against the UI using a browser tool, and something doesn't work. Then it goes and just writes code to update the database instead of using the UI element.
My other thing that makes me insane is when I tell it what to do, and it says, "But wait, let me do something else instead."
The other tax is the intermittent downtime when you are waiting for the LLM to finish. In the olden days you might have productive downtime waiting for code to compile or a test suite to run. While this was happening you might review your assumptions or check your changes or realize you forgot an edge case and start working on a patch immediately.
When an LLM is running, you can't do this. Your changes are being done on your behalf. You don't know how long the LLM will take, or how you might rephrase your prompt if it does the wrong thing until you see and review the output. At best, you can context switch to some other problem but then 30 seconds later you come back into "review mode" and have to think architecturally about the changes made then "prompt mode" to determine how to proceed.
When you are doing basic stuff all of this is ok, but when you are trying to structure a large project or deal with multiple competing concerns you quickly overwhelm your ability to think clearly because you are thinking deeply about things while getting interrupted by completed LLM tasks or context switching.
This is fine for WFH/remote work. It didn't have great optics when I went back to in-office for a bit.
How'd you reckon?
This statement resonates with me. Vibe coding gets the job done quickly, but without the same joy. I used to think that it was the finished product that I liked to create, but maybe it's the creative process of building. It's like LEGO kits, the fun is putting them together, not looking at the finished model.
On the flip side, coding sessions where I bang my head against the wall trying to figure out some black box were never enjoyable. Nor was writing POCOs, boilerplate, etc.
To people with little to no practical software experience, I can see why that seems incredible. Think of the savings! But anyone who's worked in a legacy code base, even a well-written one, should know the pain. This is worse. That legacy code base was at least written with intention, and is hopefully battle tested to some degree by the time you look at it. This is 20k lines of code written by an intern that you are now responsible for going through line by line, which is going to take at least as long as it would have to write it yourself.
There are obvious wins from AI, and agents, but this type of development is a bad idea. Iteration loops need to be kept much smaller, and you should still be testing as you go like you would when writing everything yourself. Otherwise it's going to turn into an absolute nightmare fast.
...almost as if it's too eager to make its first commit. Much like a junior engineer might be.
It's not eager enough to iterate. Moreover, when it does iterate, it often brings along the same wrong solutions it came up with before.
It's way easier to keep an eye on small changes while iterating with AI than it is with letting it run free in a green field.
Even using it to spitball ideas can be a problem. I was using Claude to bounce ideas off of for a problem I was working on it, and it was dead set a specific solution involving a stack and some complex control logic was correct, when it reality it would have made the entire solution far more complicated. All I really needed was a sliding window into an array.
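For scale, what I actually needed was roughly this simple (a hypothetical sketch, not the real code):

    def sliding_windows(items, size):
        # Yield each window of `size` consecutive elements.
        for i in range(len(items) - size + 1):
            yield items[i:i + size]

    print(list(sliding_windows([1, 2, 3, 4, 5], 3)))
    # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

Claude's stack-plus-control-logic proposal would have been far more complicated for the same result.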
It's now 11:47am and I am mentally exhausted. I feel like my dog after she spends an hour at her sniff-training class (it wipes her out for the rest of the day.)
I've felt like that on days without the meetings too. Keeping up with AI tools requires a great deal of mental effort.
They ask a business question to the AI and it generates a bunch of code.
But honestly, coding isn't the part that slowed me down. Mapping the business requirements to code that doesn't fail is the hard part.
And the generated PRs are just answers to the narrow business questions. Now I need to spend time walking it all back, trying to figure out what the actual business question is, and the overall impact. From experience, I get very few answers to those questions.
And this is where Software Engineering experience becomes important. It's asking the right questions. Not just writing code.
Next to that I'm seeing developers drinking the Kool-Aid and submitting PRs where a whole bunch of changes are made, but they don't know why. Well, those changes DO have impact. Keeping it because the AI suggested it isn't the right answer. Keeping it because you agree with the AI's reasoning isn't the right answer either.
Usually that requires saying something, seeing if the other person understands what I'm saying, and occasionally repeating myself in a different way.
It can be real tiring when I'm with friends who only speak the other language, so we're both using translator tools and basically repeating that loop for up to 2-3 hours.
I've found the same situation with vibe coding. Especially when the model misunderstands what I want or starts going off on a tangent. Sometimes it's easier to edit the original query or an earlier step in the flow and rewrite it for a better result.
However, when it comes to my professional work on a mature, advanced project, I find it much easier to write the code myself than to provide a very precise specification without which the LLM wouldn't generate code of a sufficiently high quality.
Maybe the fatigue comes from that mismatch?
The classical vibe coder style is to just ignore verification. That's not a good approach either.
I think this space has not matured yet. We have old tools (test, lint) and some unreliable tools (agent assisted reviews), but nothing to match the speed of generation yet.
I do it by creating ad-hoc deterministic verifiers. Sometimes they'll last just a couple of PRs. It's cheap to do them now. But also, there must be a better way.
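For example, one of those throwaway verifiers might be a dozen lines of Python that fail the build on a pattern the agent keeps getting wrong, say string-built SQL (a hypothetical check; the src/ path and the regex are invented for illustration):

    import pathlib
    import re
    import sys

    # Naive, project-specific check: flag SQL executed via f-strings or %-formatting.
    SUSPICIOUS = re.compile(r"""execute\(\s*(f["']|["'].*%s)""")

    failures = []
    for path in pathlib.Path("src").rglob("*.py"):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if SUSPICIOUS.search(line):
                failures.append(f"{path}:{lineno}: possible injectable SQL: {line.strip()}")

    if failures:
        print("\n".join(failures))
        sys.exit(1)

Cheap to write, cheap to throw away a couple of PRs later.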
If you're able to blaze through feature tickets using GenAI on existing projects of any major complexity, there's almost certainly something which would produce better code that you're skipping.
I have plenty of info and agents for Claude Code to take into account when I use it to make features in our projects, but what it can't know is the cadence of what our business partners expect, the unknown unknowns, conversations that humans have about the projects, and the way end-users feel about the project. My job is to direct it with those factors in mind, and the work to account for those factors takes time.