There's an interesting repository with 63,600 stars on GitHub (1). The developer of the repository is No. 1 on GitHub's trending contributors list (2). However, it seems the application isn't what it's described to be (3), and the developers, for their part, are unable to clearly answer whether it is real or not, as it's just messy LLM output.
Proof that the suit alone doesn't make anyone Iron Man.
1. https://github.com/ruvnet/RuView
So while the author's points are completely true and valid, an executive will say "True, but Claude will get smarter faster than these problems and in 3 years it'll fix everything" and there's absolutely nothing you can say or do in response to this.
The code it generated was awful. The kind of garbage that people who don’t know any better would ship: it looked right and it worked. But it was instantly a maintenance dead end. But I had an effortless time converging on a design that I wouldn’t have been able to do on my own (I’m not a designer). And then I had a reference design and I manually implemented it with better code (the part I am good at).
Which I think is perfectly worthy of exploration. Some people want to check in the prompts. Or even better, check in a plan.md or evenest betterest: some set of very well-defined specifications.
I'm not sure what the answer will be. Probably some mix of things. But today it is absolutely imperative that the code I write for the case I wrote it in is good quality and can be maintained by more than just me.
I never tried spec-driven development for myself, but if I review others' MRs I am typically exhausted after the first 10 lines.
And there are hundreds of lines, nearly always with major inaccuracies.
For myself I always found the plan mode to work well. Once the implementation is done, the code is the source of truth. If it works, it works.
When I want to add more functionality or change it, I just tell the agent what I want changed.
I doubt walls of semi-accurate existing specs are going to be beneficial there, but maybe my work differs from yours.
I mitigate this with a few things: 1. Checkpoints every few days to thoroughly review and flag issues. Asking the LLM to impersonate someone (Linus Torvalds is my favorite) yields different results. 2. Frequent refactors. LLMs don't get discouraged from throwing things out like humans do. So I ask for a refactor when enough stuff accumulates. 3. Use verbose, typed languages. C# on the backend, TypeScript on the frontend.
Does it produce quality code? Locally yes, architecturally I don't know - it works so far, I guess. Anyway, my alternative is not making this software better but not making it at all, for lack of time, so even if it's subpar it still brings business value.
I suppose you could solve that in two ways. Manually rewrite it as you did. Or formalize an architecture and let the AI rewrite it with that in mind. I suspect that either works.
I'm not a designer either, but I've been around designers long enough to recognize when something is bad but just not know what is needed to make it better/good. I've taken time to find sites that are designed well and then recreated them by hand coding the html/css to the point that I consider myself pretty decent at css now. I don't need libraries or frameworks. My css/html is so much lighter than what's found in those frameworks as well. I still would not call myself a designer, but pages look like they were designed by a mediocre designer rather than an engineer :shrug:
In the Tailwind thread the other day I was explicitly told that the intended experience of many frameworks is "write-only code" so maybe this is just the way of the future that we have to learn to embrace. Don't worry how it's all hooked up, if it works it works and if it stops working tell the AI to fix it.
It's kind of liberating I guess. I'm not sure if I've reached AI nirvana on accepting this yet, but I do think that moment is close.
Which is probably why so many random buttons in Microsoft/Apple/Spotify apps just stop working once you get off the beaten path or load the app in some state which is slightly off base.
The people pushing AI _over_ humans never thought they were. They just don't care about 'good' or 'bad', only 'time-to-market'. A bad app making money is better than a good one that isn't deployed yet. And who cares about anything past the end of the quarter? That's the next guy's problem.
At the moment, we understand the basic tech, could reasonably DIY, but choose not to, knowing full well there's a mess of understandable code somewhere we could go clean up but don't want to. We accept fast iterations because we know roughly the shape of how it "should be" and can guide an automated framework towards that. This is especially true on our own projects or something we built originally! Stark/Iron Man knew/moved, the suit assisted by adding momentum.
We're riding our "knowledge momentum".
If companies can hold out long enough, that knowledge completely fades, and the tool is all you have. At that point, they are locked in. Then it's not Iron Man, it's an Iron lung (couldn't resist!)
I love the Iron lung reference. Perfect.
The power comes from creating the machine you can steer. Treat AI like an overeager college intern who needs hand-holding but can do tasks.
- first I've created a skill describing what the architecture of the system should look like
- I'll tell the LLM to follow the guidelines; it will not do that 100%, but it will be good enough
- I'll go through what it produced, align to the template; if I like something (either I've not thought about the problem in that way, or simply forgot) I add that to the skill template
- rinse and repeat
This is not only for architecture of the system, but also when (and how to) write backend, frontend, e2e tests, docs. I know what I want to achieve = I know how the code should be organized and how it should work, I know how tests should be written. LLMs allow me to eliminate the tediousness of following the same template every time. Without these guardrails it switches patterns so often, creating unmaintainable crap
Bear in mind - the output requires constant supervision = the LLM will touch something I told it not to touch, or not follow what I told it to do. The amount of output can also sometimes be overwhelming (so peer review is still needed), but at this point I can iterate over what the LLM produces with it, with another LLM, then give it to a human if it all makes sense together.
What folks seem to avoid is that a Junior (in ANY subject) has the ability to LEARN so much faster with an AI research assistant, and that becoming an expert has accelerated for those with the personal stamina to dig deep (this as a requirement hasn't changed). I spend just as much time with my AI tooling asking questions as I do asking it to "build" or "fix" things. "How does this work?". "Can you suggest other tools?".
I think some people always think about AI as an input / output relationship, when a lot of the time, the fiddling in between, with or without AI, was always the important part. Yes, people will suck in the beginning; again, they always did. I think the good folks though will suck for a MUCH shorter time than I did getting into things.
A lot of people will drop out and get discouraged. That happened before too. Learning things requires persistence. I think the only real case to be made is that the immediate gratification AI offers can steer people away from ever running into friction. AI natives likely won't understand friction and question it.
I’m not seeing this. And based on what we’re seeing at the university level, I’m not expecting to.
The analogy is unlimited typing in Gmail won’t make you a better writer or typesetter on its own.
This is a testable hypothesis with a severe lack of citations. Intuition would argue the opposite. We learn by using our brains; if we offload the thinking to a machine and copy its output we don't learn. A child does not learn multiplication by using a calculator, and a language learner will not learn a new language by machine translating every sentence. In both cases all they've learnt is using a tool to do what they skipped learning.
1. AIs aren't yet good at architecture.
2. AIs aren't yet good at imagining technically exciting stuff to build.
And I agree that there's still space there to build a career in the short to medium term (plus Jevons Paradox). When both those points are no longer true we are certainly much closer to, dare I say it, AGI. I suspect that (1) will be solved for somewhat limited domains in the near future using harnesses. And it could snowball from there.
I didn't think this 6 months ago but today after what I've seen these models debug and accomplish in established, messy production monoliths, I'm fully convinced even the worst vibe coders are only a year or two away from being able to actually create something from scratch and have it not blow up 50 files in.
So I guess I take the totally opposite stance, today's AI is the worst AI will ever be at coding, and I believe the vested interests behind AI do not plan on making it any worse at this task, so...
Better headline: "Why AI Multiplies Developer Skills Rather Than Replacing Them"
Not the most talented developer, but this has been pretty much my experience as well. Just keep it under control, know what it's doing and why at every step, read the code, and then it will boost your productivity.
Maybe not the same agency you would expect from a human being, but if you put them in a ralph loop they can go far, far away, mostly because of how we built our world in the pre-LLM era: do you need to order something (or you want to hire a hitman)? -> you can go do it on a web site or via whatsapp or by calling some API.
The point is they mostly wind up somewhere stupid, and it takes expertise to spot and correct that. (Maybe that changes with further development.)
It's essentially a "brute force" approach, but in most cases, they only need to succeed once.
The article’s point is this is not true. They wind up in bullshit attractors where they hit a wall and then get lost within their muddled context window.
> they only need to succeed once
Yet they don’t. Not on their own. Like, you haven’t had an LLM get stuck in a stupid loop where you point out the flaw and then it gets unstuck?
I like to think of it as a normal distribution: the further a programmer is to the right of the mean, the greater their benefit. It's almost like it's the number of standard deviations they are from the mean, squared (z²). So someone like Matt Perry (as OP mentioned), who is a >99.99% programmer for argument's sake and is therefore roughly four standard deviations away from the mean... Matt gets a (4×4) 16x multiplying effect on their productivity.
Someone who is a slightly above average programmer might see a 2 or 3x boost on their productivity, which is huge(!) and might also make them fear for their job. Which tracks with the level of moral panic we are seeing and experiencing. This math kinda still holds up for "bad programmers" too (i.e. left of the mean), as in they still see a boost to their productivity (negative squared is a positive number)... but there's something iffy about their results. The technical debt is unmaintainable and because they don't _understand_ the systems that they're operating in, they end up in the "3 hour" prompt loops that the OP refers to.
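A rough sketch of that back-of-the-envelope rule, purely as an illustration. The function names are made up here, and the squared-distance rule is the heuristic from this comment, not a measured result. Python's `statistics.NormalDist` is used only to convert a percentile into a number of standard deviations.

```python
from statistics import NormalDist


def z_from_percentile(percentile: float) -> float:
    """Convert a percentile (0 < p < 1) into standard deviations from the mean."""
    return NormalDist().inv_cdf(percentile)


def productivity_multiplier(z: float) -> float:
    """Hypothetical heuristic from the comment: boost = (distance from mean)^2."""
    return z * z


# Squaring discards the sign, so programmers left of the mean
# get the same multiplier as their mirror image on the right.
for z in (-1.5, 1.5, 4.0):
    print(f"z = {z:+.1f} -> {productivity_multiplier(z):.2f}x")

# A 99.99th-percentile programmer is a bit under four standard deviations out.
print(f"99.99th percentile: z = {z_from_percentile(0.9999):.2f}")
```

One quirk of the rule as written: anyone within one standard deviation of the mean gets a multiplier below 1, so it only roughly matches the "2 or 3x" quoted for people a bit further out (around 1.4 to 1.7 standard deviations).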
> Similarly, if Matt Perry handed me the keys to the Motion repository and told me to take over, I wouldn’t have the same results even though I have access to the same set of LLM tools.
The question is -- how long is this multiplier going to exist for? Some people would wager "for the foreseeable long-term future"; some people think it will widen further; and some people think it will diminish or god forbid even collapse. It feels like most arguments at the moment (like this article's) are that the humans who "know what they are doing" will be able to batten down the hatches and avoid being usurped by ever-more-capable models. I saw it in a café yesterday: someone was using a coding agent to build a marketing website for their project, getting more and more frustrated by not getting the outcome they wanted. Their friend typed a couple of sentences on their keyboard and got a "Dude! How did you do that? That was sick!" a minute or so later. "I used to build websites" the friend said. -- The friend 'knew what they were doing'.
How much longer is knowing what you're doing going to be a moat?
For a looooonnnnngggg time, unless there's massive progress in AI research.
Fundamentally, next token prediction is limited. Granted, I'm pretty amazed at how well it's done, but if you can't activate the right parts of the models (with your prompts), then you're not going to get good results.
And to be fair, for lots of things this doesn't matter. Steve in Finance or Mindy in Marketing can create dashboards that actually help them, and the code quality mostly doesn't matter.
For stuff that needs to be shipped, monitored and maintained you still need to know what you're doing.
The question that really matters is whether that will continue to be the case. My guess is that technical expertise matters less over time, and the ability to specify the desired outcome is eventually the only thing that becomes important. But I could be wrong! The direction this all goes is pretty fuzzy in my mind.
To me, I don't see how this will ever not be an advantage. All software requires constraints. Some of those constraints might be objective (scale, performance, etc.) but a lot of them are subjective and require active decision making (architecture, UI, readability).
So if there was only one way to do something or only one desired output, then yes I think models would surpass humans. But like art, I don't think there is an objective truth to software and because of that, humans get the opportunity to play an important role.
Now whether that is valued from a business/industry perspective is a question that I think we all know the answer to unfortunately.
Everything these days is either the greatest thing ever or the worst thing ever. All the stuff in the middle has vanished. Very few it seems acknowledge AI as being a useful tool. It's either "We're all being replaced" or "The technology is all slop" and everyone talks over each other like it's the Super Bowl and their teams are battling it out.
It would be nice if we could just look to the opportunities this tech offers and focus on that.
You cannot hold a computer liable for any of those reasons. You can, however, sue the human that built or used the AI. So those concerns shouldn't be any different with or without AI. The same problems will be here either way. If you really care about those problems, you would demand your representatives in government actually enshrine those things in law, with some teeth, to ensure companies prevent problems with them. If you don't do something about those problems (with or without AI), then it's clear by your actions that ethical/environmental/safety concerns aren't actually that important to you.
I've found I can prevent the LLM, in many cases, from thrashing on a bug/feature for long periods of time by switching into plan mode and, even in the middle of a conversation, having it reassess the structure around the problem first. If you keep prompting about the same bug, it may keep producing variations of the problem code. But forcing it to stop and 'think' for a bit has yielded much better results.
I used to be a PM and am technically literate enough but can only very minimally write code. I have been using LLMs to build (or try to, at least) internal tools for my business since GPT-4.
In the early days, I'd get a little ways, then the LLM would start breaking things, and I'd try but fail to get it to fix things. But over successive generations, I was increasingly able to get it unstuck by offering suggestions on where it may have gone wrong. With Opus 4.7, I don't even really have to do that - if something isn't working it's usually sufficient to just tell it what's broken. It can figure out how to fix it without my input. And of course fewer things are broken in the first place.
So I think I'm very well positioned to understand how these things are improving - better able to get the LLM to do what I want than the post OP quoted from /vibecoding (though I am 99% sure that post is actually AI slop), but less so than most of the people posting in this thread. As they've improved, whatever ability I have to guess at the causes of problems based on my experience having seen things go wrong with products I've PMed has become less necessary to getting the right outcome.
I expect that trend to continue - increasingly the LLM won't need the guidance of people with a great deal of technical expertise. I basically no longer have to attempt to diagnose problems in order to get them fixed, though with the caveat that I am building internal tools for which I am the only user, so certainly much simpler in scope than the stuff OP is talking about.
> Without guidance, LLMs tend to paint themselves into a corner, because they’re generating code to solve individual prompts, not thinking holistically about an application’s architecture.
The crux of what I'm trying to say here is that I absolutely believe that this line is 100% true today, but I would be deeply cautious about assuming that it will continue to be true given the improvements in LLMs over the past few years.
Seemingly every AI-pilled programmer who writes a blog post on AI's impact on software engineering has the same philosophical argument, and its wording changes slightly every 6-12 months to reflect the newest models' capabilities.
In 2023 it was: "AI is just autocomplete. It can't code whole blocks on its own."
In 2024 it was: "AI is only good for scaffolding new projects, or boilerplate code. It can't write the application wholesale."
Since November 2025 it's been: "AI is only writing the code for us. It can't manage architecture, or do the long term planning required for real world applications."
In 6-12 months when the AI is doing an increasing amount of the architecture and high level planning, what will AI pilled programmers fall back on then?
When you see rising inequality, don't just cheer because you happen to win for now.. maybe think about the future and also others..
There's been a massive shift since the release of Opus 4.5 just 6 months ago - it's wild to make big claims on what AI can or can't do.
We just don't know.
AIs have skills humans aren’t good at like nerding out on technical details.
That’s not a perfect map because I’m spitballing. However there is a symbiosis.
I am not sure I am productive anymore with AI, as I am up to 125 repos and agents, most of which are tools for managing AIs, and things break so frequently that it feels like spinning plates.
I spent two months in November and December last year writing by hand a fundamental library to constrain how the AIs build CLIs. That did make things move a lot faster but for those two months I felt the slowness.
I think it will always be like this. It’s the nature of paradigm shift to shift.
- Fewer engineers needed overall -> less demand for human engineers -> lower compensation
- insufficient training at junior levels.
- longer time to productive human engineering skill.
These are playing out right now, and are a concern for all engineers in the industry. Iron Man amplification doesn't address the above.
This sentiment will stray further from the truth as time goes on.
Sure, it's a multiplier for those who are already skilled, but for those who are unskilled, it is capable of taking you from 0 -> 1+.
The ones currently benefiting from AI are the ones who (i) have a general understanding of how an AI works and experience with using it and (ii) have a very generic understanding of what it is they're trying to do (programming, most likely) and know the limits of their tools, but don't know how to actually do anything meaningful.
The whole point of AI is to open the door of complexity to normies; they are the ones benefiting most from it. For a skilled developer, it may make a 1hr task -> 5 mins; for a normie, it makes something which was utterly impossible -> now within reach. The difference for normies is just more life-changing.
If you think of skilled developers as the ceiling and normies as the floor, AI raises the floor higher by giving normies more capability, which makes the ceiling seem less impressive. But eventually the floor will surpass the ceiling, and then it'll be a matter of who can operate AI better/how good AI is.
What’s not clear to me is: if writing more code per engineer is possible, does that result in fewer engineers or just more software, especially in areas that traditionally got squeezed: UX, testing, DevEx, documentation, etc. Perhaps the bar just gets raised?
The problem is just that the question is not whether "human developers will be necessary in the near future", it's "how many human developers will be necessary in the near future" - managers wanting to exploit the efficiency gains by deciding that fewer developers can now do more work "thanks" to AI.
Those are people who weren't making it to the MVP stage before LLMs.
There is no doubt that highly technical people are getting A LOT more out of LLMs than people without dev experience, in an absolute sense. I think it's less clear in a relative sense.
A question I also ask myself a lot: What are the skills I'm leveraging, exactly, as a highly experienced developer that's now doing a lot of vibe coding?
1) I'm choosing good technology for the task, and thinking about what LLM-agents are good at and choosing technology that they can work well with.
2) I'm choosing good workflows for the LLM-agent, starting a new context at the right time, having it test things, making sure it has logging that it can inspect, making sure it can operate the application in a way that it can debug and inspect it.
3) I'm thinking about the code even though I'm not looking at it, I'm telling it how I want things implemented, I'm telling it how to debug things.
I think these are all hard things for non-developers to do, but I also think non-developers will be able to replicate a large chunk of #1 and #2 relatively quickly. I only have to figure out that it's valuable to tell the LLM-agent to use playwright when working on web page visuals once, and then I can tell you to do that too. Or the coding agents will come with that knowledge built-in (to the model or as a builtin skill or whatever). Knowledge around this will accumulate and become easier for non-developers to access, and in many cases be builtin to the models or harnesses.
voidUpdate•49m ago
Someone needs to watch Iron Man 3...