My team is incredibly clueless and complacent. I can't even get them to use TypeScript or to migrate from Yarn v1.
I don't see how that would change if you accept the premise that code is now a commodity.
In cases like this, I just read for the main point, which here is "create a way for devs to share context when working with AI".
Some recent techniques claim to solve this problem, but none has reached a release yet.
Working with what we have now, this is a recipe for disaster. Agents often lie about their outputs. The less free context they have left to work with, and the more data already packed into that context, the more prone they are to lie and deceive.
It works OK for small changes on top of human code. That's what we know works now. The rest is yet to be reached.
I'd prefer it if 2028 models were concise and generated perfect refactors.
Plan before you code. Now your plan is just in a prompt.
For this, which summarises vibe coding and hence the rest of the article, the models aren't good enough yet for novel applications.
With current models and assuming your engineers are of a reasonable level of experience, for now it seems to result in either greatly reduced velocity and higher costs, or worse outcomes.
One course correction in terms of planned process, because the model missed an obvious implication or statement, can save days of churning.
The math only really has a chance to work if you reduce your spend on in-house talent to compensate, and your product sits on a well-trodden path.
In terms of capability we're still at "could you easily outsource this particular project, low touch, to your typical software farm?"
If I read stuff like that, I wonder what the F they are doing. Agents work overnight? On what? Stuck in some loop, trying to figure out how to solve a bug by trial and error because the agent isn't capable of finding the right solution? Nothing good will come out of that. When the agent clearly isn't capable of solving an issue in a reasonable amount of time, it needs help. Quite often, a hint is enough. That, of course, requires the developer to still understand what the agent is doing. Otherwise, most likely, it will sooner or later do something stupid to "solve" the issue. And later, you need to clean up that mess.
If your prompt is good and the agent is capable of implementing it correctly, it will be done in 10 minutes or less. If not, you still need to step in.
I can see overnight for a prototype of a completely new project with a detailed SPEC.md and a project requirements file that it eats up as it goes.
I wonder how our comments will age in a few years.
Edit: to add
> Review the output, not the code. Don't read every line an agent writes
This can't be a serious project. It must be a greenfield startup that's just starting.
Badly. While I wouldn't assign a task to an LLM that requires such a long running time right now (for many reasons: control, cost etc) I am fully aware that it might eventually be something I do. Especially considering how fast I went from tab completion to whole functions to having LLMs write most of the code.
My competition right now is probably the grifters and hustlers already doing this, and not the software engineers that "know better". Laughing at the inevitable security disasters and other vibe coded fiascos while back-patting each other is funny but missing the forest for the trees.
I don't think there will be a future where agents need to work on a limited piece of code for hours. Either they are smart enough to do it in a limited amount of time, or someone smarter needs to get involved.
> This can't be a serious project. It must be a greenfield startup that's just starting.
I rarely review UI code. Doesn't mean that I don't need to step in from time to time, but generally, I don't care enough about the UI code to review it line-by-line.
Humans are not the only thing initiating prompts, either. Exceptions and crashes coming in from production trigger agentic workflows to work on fixes. These can happen autonomously overnight, 24/7.
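For illustration only, here's a minimal sketch of what such a trigger could look like: a webhook that takes a crash report from a monitoring service and hands it to a coding-agent CLI running headlessly. The endpoint, the payload fields, and the `claude -p` invocation are all assumptions, not a description of any particular setup.

```typescript
import express from "express";
import { execFile } from "node:child_process";

const app = express();
app.use(express.json());

// Hypothetical endpoint that a monitoring service posts crash reports to.
app.post("/crash-report", (req, res) => {
  const { message, stackTrace } = req.body; // assumed payload fields
  const prompt =
    `A production exception was reported:\n${message}\n${stackTrace}\n` +
    `Find the likely cause in this repository and prepare a fix on a new branch.`;

  // Run the agent non-interactively against a repo checkout; a human still reviews the branch.
  execFile("claude", ["-p", prompt], { cwd: "/srv/repo-checkout" }, (err, stdout) => {
    if (err) console.error("agent run failed:", err);
    else console.log("agent run finished:", stdout.slice(0, 200));
  });

  res.status(202).send("queued");
});

app.listen(3000);
```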
Admittedly, I have never tried to run it that long. If 10 minutes are not enough, I check what it is doing and tell it what to do differently, what to look at, or offer to run it with debug logs. Recently, I also had a case where Opus was working on an issue forever, fixing one issue and thereby introducing another, fixing that, only for the original issue to disappear. Then I tried Codex, and it fixed it at first sight. So changing models can certainly help.
But do you really get a good solution after running it for hours? To me, that sounds like it doesn't understand the issue completely.
This approach breaks the moment you need to provide any form of feedback, of course.
To be clear, this is not a hypothetical situation. I wrote long specs like that and had large chunks of services successfully implemented up to around 2h real-time. And that was limited by the complexity of what I needed, not by what the agent could handle.
I don't *yet* subscribe to the idea of "code is context for AI, not an interface for a human", but I have to admit that the idea sounds feasible. I have many examples of small-to-mid size apps (local use only) where I pretty much didn't even look at the code beyond checking that it doesn't do anything finicky. There, the code doesn't matter because I know that I can always regenerate it from my specs, POCs, etc. I agree that the paradigm changes completely if you look at code as something temporary that can be thrown away and re-created when the specification changes. I don't know where this leads or whether it's good for our industry, but the fact is: it is feasible.
I would never use this paradigm for anything related to production, though. Nope. Never. Not in the foreseeable future anyway.
> Everyone uses their own IDE, prompting style, and workflow.
In my experience with recent models this is still not a good idea: it quickly leads to messy code where neither AI nor human can do anything anymore. Consistency is key. (And abstractions/layers/isolation everywhere, as usual).
IDE - of course. But, at the very least, I would suggest using the same foundation model across the code base, .agent/ dirs with plenty of project documents, reusable prompts, etc.
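To make that concrete, here is one possible layout for such a shared directory; the file names are illustrative, not a standard:

```
.agent/
├── PROJECT.md        # architecture overview, domain glossary, conventions
├── DECISIONS.md      # running log of design decisions and their rationale
├── prompts/
│   ├── refactor.md   # reusable prompt for consistent refactoring passes
│   └── review.md     # reusable prompt for pre-merge self-review
└── specs/
    └── billing.md    # per-feature specs the agent can work from
```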
--
P.S. Still not sure what the 10AM rule brings, though...
I'm very much pro AI for coding; there are clearly significant capabilities there, but I'm still getting my head around how best to utilise it.
Posts like these make it sound like ruthlessly optimizing your workflow every single day, letting no possible efficiency slip, is the only way to work now. This has always been possible, and it has generally not been a good idea to focus on it exclusively. There have always been processes to optimise and automate, and always a balance to strike as to which to pursue.
Personally, I am incorporating AI into my daily work but not getting too bogged down by it. I read about some of the latest ideas and techniques and choose carefully which I employ. Sometimes I'll try an AI workflow and then abandon it. I recently connected Claude up to draw.io with an MCP; it had some good capabilities, but for the specific task I wanted it wasn't really getting it, so doing it manually was the better choice to achieve what I wanted in good time.
The models themselves and the coding harnesses are also evolving quickly, so complex workflows people put together can quickly become pointless.
More haste, less speed as they say!
I’ve had a lot of success dogfooding my own product, the Mermaid Studio plugin for JetBrains IDEs (https://mermaidstudio.dev).
It combines the deep semantic code intelligence of an IDE with a suite of integrated MCP tools that your preferred agent can plug into for static analysis, up-to-date syntax, etc.
I basically tell Claude Code to run the generated diagram through the analysis tool, fix the issues it detects, and repeat until it's clean. Then generate a PNG or SVG for a visual inspection before finalizing the diagram.
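In rough terms the loop looks like the sketch below; the two helpers are hypothetical stand-ins for the MCP analysis tool and the agent call, so only the control flow is the point here.

```typescript
// Rough sketch of the analyze -> fix -> repeat loop; helper names are made up.
async function refineDiagram(initial: string, maxRounds = 5): Promise<string> {
  let diagram = initial;
  for (let round = 0; round < maxRounds; round++) {
    const issues = await analyzeDiagram(diagram);   // static analysis of the Mermaid source
    if (issues.length === 0) return diagram;        // clean: render PNG/SVG for visual inspection
    diagram = await askAgentToFix(diagram, issues); // hand the findings back to the agent
  }
  throw new Error("still failing analysis after max rounds; review manually");
}

// Hypothetical signatures, declared only so the sketch type-checks.
declare function analyzeDiagram(src: string): Promise<string[]>;
declare function askAgentToFix(src: string, issues: string[]): Promise<string>;
```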
Genuinely seeking answers on the following - if you’re working that way, what are you “understanding” about what’s being produced? Are you monitoring for signal that points out gaps in your spec which you update; code base is updated, bugs are fixed and the show goes on? What insights can you bring to how the code base works in reality?
Not a sceptic, but thinking this stuff through ain’t easy!
Literally every single point in the article was good engineering practice way before AI. So it's either amnesia or simple ignorance.
In particular, "No coding before 10am" is worded a bit awkwardly, as it simply means "think before you write code", which... does that really need an article?
Not for nothing, but The Art of War includes really insightful quotes like "If you do not feed your soldiers, they will die."
wesselbindt•1h ago
This seems entirely backwards. Why spend money to optimize something that _isn't_ the bottleneck?
Towaway69•1h ago
If we take that to its logical conclusion, I think we can answer that question.
Getting rid of humans, unfortunately, also takes away their earnings and therefore their ability to purchase whatever product you are developing. The ultra rich can only purchase your product so often - hence better make it a subscription model.
So there is pressure on purchasing power versus earnings. Interesting to see what happens and why.