Fixed that for you
They're getting better, but a lot of the improvement was driven by increases in the training data. These models have now consumed literally all available information on the planet - where do they go from here?
Coding is arguably the single thing least affected by a shortage of training data.
We're still in the very early steps of this new cycle of AI coding advancements.
Which, believe it or not, is the same issue I see in my own code.
I'm waiting for the day when every comment session on the internet will be full of people predicting AGI tomorrow.
That's a sign that you need to refactor/rearchitect/better-modularize the initial code. My experience is that with no existing code patterns to follow, the LLM will generate a sprawl that isn't particularly cohesive. That's fine for prototyping, but when the complexity of the code gets too much for its context, taking a day or so to reorganize everything more cleanly pays off, because it will allow it to make assumptions about how particular parts of the code work without actually having to read it.
First app was to scrape some data using a browser. It did an excellent job here: it went down one wrong path that it obsessed over (it was a good idea in theory and should have worked), but in the end produced a fully working tool that exceeded my requirements, with a UI way more polished than I would have bothered with for a tool I wrote for me only.
Second app is a DHT crawler. It has gone down so many dead ends with this thing. The docs for the torrent tools don't match the code, I guess, so it gets horribly confused (so do GPT, Grok, Claude, Gemini). Still not working 100% and I've wasted way more time than it probably would have taken to learn the protocols and write it from scratch.
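For context on why this protocol trips up both humans and models: the DHT speaks KRPC, bencoded dictionaries sent over UDP (BEP 5). A rough sketch of building the simplest query, a ping, with a hand-rolled bencoder rather than any particular torrent library, might look like this:

```python
def bencode(value):
    """Minimal bencoder covering the types KRPC messages use."""
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, str):
        return bencode(value.encode())
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings and appear in sorted order.
        items = sorted((k.encode() if isinstance(k, str) else k, v)
                       for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def ping_query(node_id, txn_id=b"aa"):
    """Build a KRPC ping query per BEP 5, normally sent over UDP to a DHT node."""
    return bencode({"t": txn_id, "y": "q", "q": "ping", "a": {"id": node_id}})

msg = ping_query(b"\x00" * 20)  # real node IDs are 20 random bytes
```

The mismatch between docs like this and what library code actually does is exactly where an LLM with no ground truth starts to flail.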
The main issue is -- I have no idea what the code looks like or really how it works. When I write code I almost always have a complete mental map of the entire codebase and where all the functions are in which files. Here I literally know none of that. I've tried opening the code on the DHT app and it is mentally exhausting. I nope out and just go back to the agent window and try poking it instead, which is a huge time waster.
So, mixed feelings on this. The scraper app saved me a bunch of time, but it was mostly a straightforward project. The DHT app was more complicated and it broke the system in a bunch of ways.
Try again in 6 months?
It gets the things they want to do done. No paying someone else, no asking for help on $chatprotocolchannel, no getting a friend to help. It's just there. It doesn't matter if it's ugly. It doesn't need to be monetized, doesn't need to be worked on by a team of people. It just needs to work... enough.
Vibe coding may not be for you. But vibe coding is so that we don't need to bother you for trivial things (like porting a 12-file X11/cairo C program to a single .pl file cairo/gtk program).
e: Not a q for parent, but maybe for others. Are we supposed to be making a distinction between "vibe coding" and "coding with an AI agent"? What is the difference?
I think I might argue the opposite but sure.
As long as you stick within pretty strict limits. I'm seeing more and more folks from non-programming backgrounds who are running up against the limits of vibecoding, and they aren't prepared for just how expensive some of the larger mistakes will be to fix if they end up having to bring actual software engineers in to fix/scale their products.
Or at least, before, $project would take weeks instead of an hour. I don't know C, but now I can just have an LLM translate/port it to Perl, and I can fix the mistakes in that domain where I do have skill.
The old saying was that ideas are cheap, value is in execution. With LLM coding, execution is cheap too, up to a point.
What LLMs are poised to replace is basically the entirety of the liberal arts and academic disciplines, where output is largely measured on the production of language - irrespective of that language's correspondence to any actual underlying reality. Musicians and authors - fiction and non-fiction alike - are already witnessing their impending obsolescence in real time.
People make music because they want to. They always have, since the dawn of civilization. Even if we end up somehow making music bots that can generate something OK, people won't magically stop wanting to make music.
And for both cases, it's important to remember that ML/LLM/etc can only re-arrange what's already been created. They're not really capable of generating anything novel - which is important for both of those.
LLMs specifically and AI generally are not going to replace your Taylor Swift, Ed Sheeran, Twenty One Pilots, Linkin Park, etc.
Speaking to the parent comment specifically - it strikes me as uninformed that “liberal arts and academic” disciplines rely only on language production. That’s absurd.
I have a hard science degree but I started my education in the humanities and spent time studying business and management as well. That gave me an immense advantage that many technologists lack. More importantly it gave me insight into what people who only have a hard science education don't understand about those fields. The take that they aren't connected to reality is chief among those misunderstandings.
LLMs can’t replace any field for which advancement or innovation relies upon new thinking. This is different from areas where we have worked out mathematically articulable frameworks, like in organic chemistry. Even in those areas, actual experts aren’t replaceable by these technologies.
On the contrary; leftist thinking since the nineteenth century, and post-modernism in particular, is infamous for dispensing with the notion of objective reality and the enterprise of science and empiricism entirely, rejecting them as grand narratives that exist merely to perpetuate the ruling power structures [0].
> Speaking to the parent comment specifically - it strikes me as uninformed that “liberal arts and academic” disciplines rely only on language production. That’s absurd.
If everything is narrative in structure and linguistically mediated, then yes, these disciplines are primarily focused on the production of language as a means for actuating reality.
[0] https://www.google.ca/books/edition/Nexus/gYnsEAAAQBAJ?hl=en...
Unfortunately, the market for art will be filled with lowest-common-denominator stuff generated by AI. Only the high end will be driven by non-AI work. There's not enough of a market for everyone to do high-end stuff.
Wealth is the time required to make great art; every great artist needs this.
Currently, some artists are able to make a living from their art and can spend the time and effort to create great art without being independently wealthy, but that is going to become increasingly difficult.
Any music that is completely novel is crap. All good music is inspired by, borrows from and outright steals from past music.
“Immature poets imitate; mature poets steal; bad poets deface what they take, and good poets make it into something better, or at least something different. The good poet welds his theft into a whole of feeling which is unique, utterly different than that from which it is torn.”
- TS Eliot
An LLM may well not be as good as a human at creating music. But it's certainly capable of creating music that is either more novel or less novel than any arbitrary human composition. You just need randomness to create novel music; making it tasteful still requires a human.
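As a toy illustration of the randomness point (the scale choice and melody length here are arbitrary assumptions made for the example, not anyone's actual method):

```python
import random

# Pitches are MIDI note numbers in C major pentatonic; any pitch set works.
PENTATONIC = [60, 62, 64, 67, 69]  # C D E G A

def random_melody(length, seed=None):
    """Randomness alone guarantees novelty; it says nothing about taste."""
    rng = random.Random(seed)
    return [rng.choice(PENTATONIC) for _ in range(length)]

melody = random_melody(16, seed=1)
# With overwhelming probability, a fresh unseeded 16-note draw has never
# been written down before: novel, but nobody would call it tasteful.
```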
This is one of those statements that can be comforting, but is only true for increasingly specific definitions of “novel”. It relies on looking at inputs (how LLMs work) vs outputs (what they actually create).
A multi-modal LLM can draw “inspiration” from numerous sources, enough to make outputs that have never been created before. They can absolutely make novel things.
I think the real question is whether they can have taste. Can they really tell good ideas from bad ones beyond obvious superficial issues?
Someone not musically talented but appreciates music could generate a hundred songs with an AI, pick ten good ones, and have an album that could sell well. The AI is a tool, the user is the only one that really needs taste.
Unfortunately, "want to" doesn't pay the bills. Before recorded music, there was a lot more demand for a jazz saxophone player, and that musician made a decent middle class living off of that skill. Now we have Spotify and don't need to pay that musician. Suno.ai makes very passable music. Midjourney makes passable art. The microwave makes passable food. LLMs write passable books.
Programming as a job will go the way of the saxophone player as a job, but then what's left? Are we a jester economy? One where everyone competes to perform tricks for wealthy billionaires in order to get some money to eat?
There's millions upon millions of pieces of music, books, and other media forms in existence. Pretty much anything humans make is derivative in some way. The idea that humans create totally novel things is sort of silly. TVTropes is a thing for a reason.
I'm definitely not an AI booster but I'm realistic about the stuff humans create. It's entirely possible for AI to write hit pop songs and you'd never know unless someone admitted to it. They're not complicated and extremely derivative even when humans write them.
haha. no, we're not.
LLMs have no understanding of the limited time of a human life. That's why what they produce is flat and lacking emotion, and why the food they make sometimes tastes bland, missing those human touches, those final spices that make the food special.
This is the wrong view. It's more like "Soon, everyone will be able to go from idea to a prototype". IMO, there's a different value perception when people can use concrete things even if they are not perfect. This is what I like about end-to-end vibe coding tools. I don't see a non developer using Claude Code but I can totally see them using Github Spark or any similar tool. After that, the question is how can I ensure this person can keep moving forward with the idea.
I think overzealous LLM hype is a sort of Gell-Mann amnesia.
Software development doesn't occur in a vacuum -- it's part of a broader ecosystem consisting of tech writers, product managers, sales engineers, support engineers, evangelists, and others. AI coding enables each person in the org to participate more efficiently in the scoping, design, and planning phases of software development.
Again and again and again
I'm using LLMs a fair bit so I'm not doom and gloom entirely on it, but I do think our adjustment period is going to be rough. Especially if we can't find a way to make the LLMs actually reliable.
If people have to pay e.g. $100 for each of those prototypes, I doubt they'll be very popular. Sure, it's still cheaper than paying a dev, but it will be expensive to iterate and experiment.
You know how the average dev will roll their eyes at taking over maintenance of a "legacy" project. Where "legacy" means anything not written by themselves. Well, there will be a lot more of these maintenance takeovers soon. But instead of taking over the product of another dev agency that got fired / went bankrupt / ..., you will take over projects from your marketing department. Apps implemented by the designers. Projects "kickstarted" by the project manager. Codebases abandoned at the point the Anthropic / Google / OpenAI / ... tool became untenable. Most likely labelled as "just needs a little bit more work".
These LLM tools are amazing for prototypes. Amazing. I could not be anywhere near as productive for churning out prototypes as claude code is, even if I really tried. And these prototypes are great tools for arriving at the true (or at least slightly better) requirements.
Prototypes should get burned when "real" development starts. But they usually are not. And we're going to do much, much more prototyping in very near future.
True, but not a new thing! You've never known true development pain until you're told something from another department "needs some help to get productionized", only to find out that it's a 300 tab Excel file with nightmarish cross-tab interdependencies and VBA macros.
Genuinely not sure if vibe coded Python would be an improvement for these types of "prototype" projects. They'll definitely continue to exist, though.
He left that job a week later. It never went on his resume or LinkedIn.
so somewhat worse than what we have now
Last year it was "prompt engineer", or was that 2 years ago already. Things move fast on the frontier...
I don't think the author is fundamentally wrong, but it's delivered with a sense of certainty that's similar in tone to the past 5 years of skepticism that has repeatedly been wrong.
Instead of saying "vibe coded codebases are garbage", the author would be better served writing about "what does the perfect harness for vibe coded codebase look like so that it can actually scale to production"?
Can you point to any app with scale that has been vibe coded?
More broadly, it's unfortunate that vibe coding is so overloaded a term.
- Yes, product managers with 0 coding expertise are contributing code in FAANG.
- Yes, experienced engineers are "vibe coding" to great success.
- Yes, folks with 0 years of experience were building simple calculators 2 years ago and are now building games, complex websites, etc., just by prompting.
Where will this go in another two years?
One need look no further than Kiro, Amazon's own code editor that's being used extensively internally.
Folks bemoaning vibe coding are simply suffering from lack of imagination.
I also hit the complexity wall and worked through it. LLMs are genius with arms that can reach anything but eyes two inches from the screen. As a metaphor, think of when a code base gets too big for one person to manage everything, people start to "own" different parts of the code. Treat LLMs the same and build a different context for each owner.
The key is to not get to a point where you lose sight of the big picture. You are riding a bronco - try not to get thrown off! If you do, ask the LLM to describe the structure of things or restructure things to be more consistent, etc. Get to a place where YOU understand everything, at least in an architectural sense. Only then can you properly instruct the AI forward.
If agents get so good that they overcome these obstacles, then most mid-tier companies' dev staff is going to be a couple of people making sure the agents are online and running.
Vibe coding is just the canary in the coal mine.
I do agree that worse but cheap will be used a lot right now. But we also have it already with outsourcing, and not everything is outsourced.
Signaling theory provides a useful perspective here. If everyone has access to the same tools for thought, then everyone can arrive at the same output independent (mostly) of skill. That means the value of any one computer-generated output will crash towards an average minimum. Society will notice and reward the output that is substantially better and different, because it shows that someone either has access to something far better than the rest or is capable of something the rest are not. The effect is heightened further when the markets are flooded with AI slop, which will become obvious and distasteful.
Those with skills and ability to differentiate will continue their existing ascent.
My point is that the same people predicting those improvements are also predicting that LLMs will soon lead to super human AGIs.
"Don't apologize. Don't say I'm right. Don't say it's a good idea. Don't say something is fixed unless I say so. Don't say that the issue is clear now. Don't use emojis. Keep a detached, professional tone and focus on the matter at hand.".
Worth every token.
I feel like the AI companies are constantly getting ahead of themselves. The recent generation of LLMs is getting really good at writing or modifying code incrementally following a precise specification. But no, of course that's no longer good enough. Now we have agents who are as dodgy as LLMs were a few years ago. It's as if Boeing launched the 707 too early, got it to work after a few (plane) crashes, but then, instead of focusing on that, they launch the 747 also too early, and it also promptly crashes. Little wonder that people will be more preoccupied with the crashes than with what actually works...
Interestingly, I wrote something similar recently "Too Fast to Think: The Hidden Fatigue of AI Vibe Coding", https://www.tabulamag.com/p/too-fast-to-think-the-hidden-fat...
There seems to be something overloading our capacity as coders.
I wouldn't say it is overload, it is just that some things are faster (writing code) and some things are the same. There is backpressure but overall, things are moving slightly faster.
I also don't think it makes much of a difference at a large company when actual implementation is a relatively small part of the job anyway. Am I going to try and save 30 minutes writing a PR when it takes weeks for third-party review and the downtime is millions of dollars a minute...no.
The point being made is that vibe coding is changing so fast that any investment you make today in learning is quickly obsolete tomorrow, as someone puts out a new tool/framework/VSCode-fork that automates/incorporates your home-brewed prompt workflow.
It's like a deflationary spiral in economics -- if you know that prices will drop tomorrow, only the sucker buys something today (but because no one is buying today, that demand destruction causes prices to drop, creating a self-fulfilling doom loop).
Similarly: with LLM coding, any investment you spend in figuring out how to prompt-engineer your planning phase will be made obsolete by tomorrow's tools. Really your blog post about "how I made Claude agents develop 5 feature branches in parallel" is free R&D for the next AI tool developer who will just incorporate your tips&tricks (and theoretically even monetize it for themselves)
The argument here (get your pitchforks ready, all ye early adopters) is "we all just need to sit back for 6 months and see how these tools shake out, early adoption is just wasted effort."
On the one hand, hype cycles and bubbles are almost always driven by FOMO (and sometimes the best strategy is to not play). On the other hand, LLMs are legitimately changing how we do coding.
There's always going to be people who need to have the latest iphone or be trying the latest framework. And they're needed -- that's how natural adoption should happen. If these tools tickle your fancy, go forth and scratch that itch. At the very least, everyone should be at least trying these things out -- history of technology is these things keep getting better over time, and eventually they'll become the norm. I think the main perspective to hold, though, is they're not quite the norm yet, and so don't feel like a schlub if you're not always all-in on the trending flavor of the week.
We still don't know the final form these will take. Learn what's out there, but be okay that any path you take might be a dead end or obsolete in a month. Use a critical eye, and place your effort chips appropriately for your needs.
That argument never held any water, and it's not what makes deflation problematic. You just have to look at the first 5 decades of electronics and computers, and how people kept buying those, again and again, despite the prices always going down and quality persistently going up.
The same applies to that argument applied to those tools. If you can use them to create something today (big if here), it doesn't matter that tomorrow tools will be different.
Everyone is doing this sort of "better write some MCPs" thing, so that you can keep the LLM on the straight and narrow.
Well, let me tell you something. I just went through my entire backlog of improvements to my trading system with Claude, and I didn't write any MCPs, I didn't write long paragraphs for every instruction. I just said things like:
- We need a mock exchange that fits the same interface. Boom, here you go.
- How about some tests for the mock exchange? Also done.
- Let's make the mock exchange have a sawtooth pattern in the prices. Ok.
- I want to write some smart order managers that can manage VWAPs, float-with-market, and so on. Boom, here you go.
- But I don't understand how the smart orders are used in a strategy? Ok, here's a bunch of tests for you to study.
- I think we could parse the incoming messages faster. Read the docs about this lib, I think we can use this thing here. Boom, done.
- I have some benchmarks set up for one exchange, can you set up similar for others? Done.
- You added a lock, I don't want that, let's not share data, let's pass a message. Claude goes through the code, changes all the patterns it made, problem solved.
- This struct <here>, it has a member that is initialized in two steps. Let's do it in one. Boom, done.
- I'm gonna show you an old repo, it uses this old market data connector. How do we use the new one instead? Claude suggests an upgrade plan starting with a shared module, continuing towards full integration. Does the first bit, I'm mulling over the second.
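As a rough sketch of what the mock-exchange-with-sawtooth idea above might look like (the interface and names here are hypothetical, not the commenter's actual code):

```python
from abc import ABC, abstractmethod

class Exchange(ABC):
    """Hypothetical interface shared by the real and mock exchanges."""
    @abstractmethod
    def best_price(self, symbol: str) -> float: ...

class MockExchange(Exchange):
    """Deterministic sawtooth prices, handy for testing order managers."""
    def __init__(self, base=100.0, amplitude=5.0, period=10):
        self.base, self.amplitude, self.period = base, amplitude, period
        self.tick = 0

    def best_price(self, symbol: str) -> float:
        # Sawtooth: ramp linearly from base toward base+amplitude, then reset.
        phase = (self.tick % self.period) / self.period
        self.tick += 1
        return self.base + self.amplitude * phase

ex = MockExchange(period=4)
prices = [ex.best_price("BTC-USD") for _ in range(5)]
# ramps 100.0, 101.25, 102.5, 103.75, then wraps back to 100.0
```

The point of the deterministic pattern is that tests for the smart order managers can assert exact fills without depending on a live feed.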
Over the last four days, it has revolutionized my code. I had a bunch of things I knew I could do if given enough time. None of the above would stump me as an experienced dev these days. But it would take my attention. I'd be in a loop of edit/compile/test over every feature I've mentioned above, and I would be sure to have errors in either syntax or structure. At some point, I would use a pattern that was not ideal, and I'd have to backtrack and type/google/stackoverflow my way out of it, and each step would take a while.
Now, I can tell Claude what to do, and it does it. When it fails, it's productively failing. It's closer to the target, and a gentle nudge pushes it to where I want.
But if true, that's a good thing. It means once you get to a certain structure, edits are now close to free.
First, it's always some 'telegram bot' type project where it all starts to break down when they try to add too many features on top of the existing (buggy) features without understanding what the stupid robot is up to.
Second, they all come to the conclusion they don't want to be 'unpaid project managers' and it's better for the entire world for people to get paid the $100k+ salaries to write crappy javascript.
During the heart of Covid, when the scientific archives were opened for everyone, I downloaded a bunch of stuff and have been working through some of it with the help of Claude. Perhaps surprising to the vibe coders, if you constrain the robots and work through their lack of 'intelligence' they're pretty good at taking the theoretical concepts from paper A, B and C and combining them into a cohesive system which can be implemented as individual modules that work together. I believe there used to be a term of art for this, dunno?
You can also stumble on things which nobody really realized before as the theoreticians from Paper A don't go to the same parties as the theoreticians from Paper B so they don't realize they're working on the same problem using a different 'language' and only when you try to mash them together does the relationship become obvious. Having a semi-useful robot to think through something like this is indispensable as they can search through the vast databanks of human knowledge and be like "no, you're an idiot, everyone knows this" or "yeah, that seems like something new and useful".
So, yeah, horses for courses...
His projects were pure glue code. One was some data sets + some data visualization with charts + some map api. A few hours of work, it looked rather remarkable.
Our chat was especially hilarious since I take writing everything from scratch to an absurd level. His one trick was to know all the places online where you can ask your noob developer questions. He was working with a kind of imaginary token system where you get to ask a limited number of questions per month on each platform.
His queries were truly fantastic. A reasonable dev could answer them with little effort, sometimes the next guy would write out the glue code. He didn't ask for much.
13 seems to be some kind of magical sweet spot where we take the person just seriously enough.
Sometimes people had questions about the API, he knew those inside out which was a hilarious contrast.
I asked him: What do you do when it stops working? The answer was to start from scratch. He posted the old glue code, explained it didn't work anymore and someone would answer the easy question.
The process was so completely alien to me that I'm not going to guess where the LLM's are taking us.
And yes I know AI is marketed as more. But it’s still people’s fault for swallowing the PR and shipping crappy code then complaining about the lies. Stop deflecting responsibility for your work
For example, say you have a SAR image and want to classify whether there are vessels in that image, and if so, identify which vessels. A company will do that for you, for approximately $100 per image. Or maybe you want a tool to analyze the radar parameters of some object (like what sort of radar a vessel or airplane has); that is also something niche companies will do for you, at a price. People would be surprised to hear how many thousands a license to some analysis package costs annually.
The people that use these services, tend to be analysts and domain experts. Personally I have SWE experience prior to moving to analysis, but I've always been too bogged down with analysis-work to actually develop the tools I've wanted. And budget constraints have made it hard to hire a team of devs to only work on building tools.
Luckily the advancements in LLMs have made that part much, much easier. So now we are able to get together, plan out the tools we want, and actually get things done.
I was maybe a bit loose with the "vibe-coding" part. We are mainly domain experts, but the software engineering experience varies from very little - to quite experienced, so we're not completely fumbling in the dark.
The job of a SaaS product owner is to ensure that you're as screwed-as-possible without their product. That's not the same as ensuring that you've met your own goals. Having five or six such people all fighting over your workflow with no coordination between them (since they don't know what other SaaS products you're consuming) is 3 parts helpful, 4 parts confusing.
People who need help with software should hire software engineers, not buy a product built by software engineers. That way the engineers can gather domain expertise since their co-workers are specialists in that domain, and also the domain experts can gather software expertise in the same way. What we're doing creates needless complexity, silos, and fragility on both sides.
This is the view I most agree with in this discourse. That's why I am not enthusiastic about AI.
Retail store clerk was a medium skill/medium prestige job up through the 1970's and 1980's. You needed someone who knew all of the prices for their part of the store (to prevent tag switching), who could recognize what was the expensive dress versus the cheap dress, and remembered all of the prices so they could work quickly without having to look things up (1). There would be stories of retail clerks rising up to CEO, and even if you weren't that ambitious it was a career path that could provide a comfortable middle-class lifestyle.
Then the laser/UPC system came in and hollowed out the career path. UPC codes could be printed in a more permanent way (no need to put 10c stickers on every orange) and then it was just a DB lookup to get the current price. And there was a natural-language description that the register printed so you could do a quick confirmation.
This quickly hollowed out and destroyed the career path: once the laser/UPC/database system meant that almost anyone could be an okay retail clerk with very little training, companies quickly stopped caring about getting anyone above that level, experience no longer mattered or was rewarded with pay increases, and it created the so-called "McJob" of the 1990s.
Karl Marx had actually written about this all the way back in the 19th Century, this was the alienation of labor, the use of capital investment to replace skilled labor with interchangeable repetition that kept people from feeling satisfaction or success at their jobs- and the pay and career satisfaction of retail clerks in 1990s America followed almost exactly the path of skilled weavers being replaced by machines in 19th Century Birmingham.
Will that happen with SWE? I don't know. But it is a thing that preys on my mind late at night when I'm trying to sleep.
Another historical example I find myself thinking about is the huge number of objects that moved into a disposable category in the last century or so. What used to be built entirely by skilled hands is now the prerogative of the machine and thus a certain base level of maintainability is no longer guaranteed in even the worst-designed widget.
Yes, AI will mean more software is "written" than before, but more will be thrown away too. The allure of the re-write is nothing new, but now we'll do it not because we think we can do it better the second time, but because the original is too broken to keep using and starting over is just so much cheaper.
Soon at CRUD App HQ (if not already): "Last year's pg_dump is this year's prompt."
once we figure out the right building blocks of applications, it will simply be inefficient to write code yourself.
perhaps a controversial take, but writing code is going away (99% of it).
1. write custom components on the fly (ai can)
2. compose arbitrary components (composition isn't solved)
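A toy sketch of why point 2 is the hard part: two independently generated components only compose when their interfaces happen to line up, and nothing forces that (all names here are made up for illustration):

```python
from typing import Callable

def compose(f: Callable, g: Callable) -> Callable:
    """Pipe f's output into g; works only if the types agree."""
    return lambda x: g(f(x))

# Two hypothetical "components" an AI might generate independently:
def parse_csv_row(line: str) -> list[str]:
    return line.split(",")

def total_cents(cents: list[str]) -> int:
    return sum(int(c) for c in cents)

pipeline = compose(parse_csv_row, total_cents)
# pipeline("10,20,12") returns 42 because the interfaces happen to match;
# swap in a component expecting a dict and the pipeline breaks at runtime.
```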
We're in a precarious spot where code is cheap to generate thanks to LLMs, but it's hard to build Good Software™ with only generated code. It's a pit of despair, of sorts.
If we can get LLMs to a place where that's no longer true, then writing code going away may be the new base case, but there's no guarantee we can extend LLMs that far.
I remember this from the days of watching TV. There were these preposterous commercials saying "23% more efficient than other toothpastes" or "33% less dandruff than a regular shampoo" or shit like that. How do you know what products I use? How do you measure that? What skin type? No. It is just better. Trust us.
I mean, the financial backing in this sector is staggering. We know that already. It's a fact. There are also numbers. Billions if not trillions of them. What does Joe The Developer think all this money goes to? Some of it, and not a small part, goes into marketing. Unless Joe still believes in the "build it and they will come" fake motto. Whoever has a stake in this will back it up, marketing it like crazy. Assume victory even in defeat, as the old guy says. I was laughing hard one day when I saw Ilya Sutskever, laptop in hand, strolling the parks in search of a green meadow where he could work and develop ground-breaking ideas to save humanity! That's just marketing.
Liked your post. I don't think it matters (that much) that your native language is not English. We don't want to sound all the same by using AI to fix our grammar (ok, maybe that one, yes) or the awkward twists of our sentences. Sometimes AI fixes them too well, leaving little room for any poetry in them.