Mind sharing which points were pretty rough? I don't want the message to get lost in the "oof this is AI slop" mess, so I'm always interested in what might be hitting folks the wrong way so I can adapt.
> My process is to dictate and word vomit all of my thoughts to AI (in this case Gemini 2.5 Pro) and then refine from there. I know there's some amount of "smells like AI" in the writing, but at this point I don't think it takes away from the lessons shared. That said, maybe it's worth it to modify things so there is less "AI sounding" verbiage. Don't want folks to write it off as "AI Slop" because of certain phrases. TBD!
I'm not even going to bother reading the rest. Just give me the bullet points you fed the LLM at that point
> That said, maybe it's worth it to modify things so there is less "AI sounding" verbiage. Don't want folks to write it off as "AI Slop" because of certain phrases. TBD!
However, it's also the direction given to the AI during the enhancement - it's been steered toward a buzzword/hype style.
“I managed a swarm of AI agents…”? How about instead just “I ran multiple instances of Claude Code and the results seem promising”? No swarms necessary.
In my experience, letting LLMs work autonomously just produces code slop.
Model matters too. Lots of what I did was with Opus. Significantly better than Sonnet.
Also, I have no desire to micromanage a bunch of robots. I'd rather build things.
I've primarily stuck to working with one agent, sometimes two: partly to first understand how to manage one in more of a pair-programming setup, but also to see if they can be used more like a managed developer. They still do too many dumb things to be let loose, and I doubt that will change without significant changes to the underlying models.
Also, you need to spend the time reading what they wrote and checking not just for correctness and passing tests, but whether it's even the right approach. They misinterpret the question or task, even when it's spelled out in painful detail. They lose focus or get caught in loops. They love to write everything from scratch instead of using the helper packages sitting right there. It's like the bell curve flattened out: they are as impressive as they are astonishingly moronic, context dependent (pun intended).
I gotta say, nearly everything I've built, both at work and at home, ends up being another tool I have to micro-manage to some degree or another.
So does any template off GitHub?
When you run the Laravel Installer, you get pretty much all of these for free (sans LLM). And within a normal work week, you can have a valid prototype with an admin panel (Laravel Nova) and a lot of the other stuff the Laravel ecosystem provides. The issue is not coding that stuff (it's already been coded for you). The issue is knowing exactly what you want that isn't part of some template: the actual business value of the software.
This is a serious question, I'm not being snarky - do you run your comments through AI before posting them here?
Yes, if you want code to be maintainable over time, or to be able to evolve as a coherent foundation that lends itself to a long-term direction and vision.
FWIW, the vision and direction can and should still be dictated by the human engineer. That's what engineers will be doing, IMO, rather than coding.
> The cognitive load of this new state was immense. After about three hours of intense orchestration, I would feel completely burnt.
I definitely feel that. It's like the job becomes solely architecting the system (not terrible), but you still need to hold the entire program in your head (deeply taxing) while it is updated by 1,000 tiny cuts which can dislodge your mental model... and then you are screwed. It is important to center yourself from time to time and ensure you are on solid ground. The best way to do that is to give things a minute to sink in. Unless I am boilerplating everything, I don't WANT to go faster.
In general, I don't have a lot of desire to go down this orchestrator path. Maybe once tooling and things settle down a bit it can be helpful at times, but usually I get more value from letting things breathe, especially if there are structural changes or new ideas that bubble up and need to be considered.
Before LLMs, I would never re-architect on the fly, but I am more willing to make structural changes mid-flight now because it is so costless. Things I used to architect "good enough" (with a plan to revise in 12 months) can now be done precisely right the first time, after going halfway down a path and then noodling on it for a day or so.
This is "Not the Way to Architect," but it sure is an effective way to go from idea -> complete.
one of the weirdest things about LLMs now being cheap and easily available is finding out how many people have no pride in their work, and will just "send a PR to the curl team without understanding it" or "publish some crap blog post, on their personal blog, and then push it to HN".
why?
it's great that you found a tool that can generate text for you to pass off as your own, but why did you then stop giving a fuck about what you're passing off as your own?
Maybe you should contribute some of your ideas vs. bashing others who do?
I’m not claiming vibe coded projects don’t have a place. I’m skeptical that using English prompts can build a maintainable code base as well as using a programming language.
zachwills•2h ago
I spent last week in a deep-dive experiment to see how far I could push modern agentic workflows on a greenfield project. I wanted to move past simple code generation and see if I could build a system where I was orchestrating a team of agents to build a full application.
The results were pretty wild (~800 commits, 100+ PRs, and a functioning app we use internally at my company), but the most interesting part was the playbook of rules I had to develop to make it work. The post covers the 8 rules I learned, from managing the AI's context window with sub-agents and manual checkpoints, to creating autonomous test loops, to why I had to become ruthless about restarting failed runs.
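To make the "autonomous test loop" rule concrete, here is a stripped-down Python sketch of the pattern. It's illustrative only: the headless `claude -p` invocation and the `pytest` command are stand-ins for whatever agent CLI and test runner a project actually uses, not the exact script from the experiment.

    import subprocess

    def run_agent_with_test_loop(task: str, max_attempts: int = 3) -> bool:
        """Hand the agent a task, then loop: run the tests, feed failures back, retry."""
        prompt = task
        for _ in range(max_attempts):
            # Headless agent call -- a stand-in for however you invoke your agent.
            subprocess.run(["claude", "-p", prompt], check=False)

            # Run the project's test suite and capture its output.
            tests = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            if tests.returncode == 0:
                return True  # tests are green, this run converged

            # Feed the failures back so the next pass has the missing context.
            prompt = (
                f"{task}\n\nThe test suite is failing. Fix these failures:\n"
                f"{tests.stdout[-4000:]}"
            )

        # A run that won't converge is usually cheaper to restart than to rescue.
        return False

The attempt cap is where being ruthless about restarting comes in: once a run stops converging, a fresh start usually beats trying to rescue it.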
A few quick notes to preempt questions:
Tech Stack: The core of this was Claude Code, a custom parallelization script, and open-source MCPs like Serena. (A rough sketch of the parallelization idea follows these notes.)
Cost: The token cost was significant (~$6k). This was an experiment to push the limits, not to optimize for cost efficiency... yet.
Effort: This was not a standard 40-hour week. It was an intense, "in the hole" sprint with a very high cognitive load.
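The parallelization script itself is mostly glue; a simplified sketch of the idea looks something like this (the git-worktree layout, task names, and headless `claude -p` call are illustrative assumptions, not the real script):

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical task split -- in practice each prompt pointed at a spec doc.
    TASKS = {
        "auth-flow": "Implement the login/logout endpoints described in SPEC.md",
        "billing": "Wire up the billing webhooks described in SPEC.md",
    }

    def run_in_worktree(name: str, prompt: str) -> int:
        # One isolated git worktree per agent so parallel runs don't stomp on
        # each other's files; every branch comes back as its own PR for review.
        path = f"../worktrees/{name}"
        subprocess.run(["git", "worktree", "add", path, "-b", f"agent/{name}"], check=True)
        return subprocess.run(["claude", "-p", prompt], cwd=path).returncode

    with ThreadPoolExecutor(max_workers=len(TASKS)) as pool:
        results = {name: pool.submit(run_in_worktree, name, prompt)
                   for name, prompt in TASKS.items()}
    for name, future in results.items():
        print(name, "finished with exit code", future.result())

The worktree isolation is the important part: it keeps parallel agents from clobbering each other and keeps every run reviewable as a normal PR.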
I’m convinced the role of an engineer is shifting from a hands-on coder to an architect of these intelligent systems. I’m curious to hear how others are approaching this. What workflows or tools for managing agents have you found to be effective?
gjsman-1000•1h ago
esafak•1h ago
zachwills•1h ago
giancarlostoro•1h ago
I think this is what a lot of people do and don't understand. The AI is only as good as your ability to architect the software. I'd be okay with someone shipping AI-generated code if they hand-wrote all the unit tests to ensure the AI isn't producing garbage; I think that's a reasonable trade.
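For example, even a tiny hand-written suite like this (written against a hypothetical AI-generated `parse_price` helper in a hypothetical `pricing` module) pins down behavior the generated code is not allowed to break:

    from decimal import Decimal

    import pytest

    # parse_price is a hypothetical AI-generated helper; these hand-written tests
    # are the human-authored guardrail on its behavior.
    from pricing import parse_price

    def test_plain_dollar_amount():
        assert parse_price("$19.99") == Decimal("19.99")

    def test_thousands_separator():
        assert parse_price("$1,204.50") == Decimal("1204.50")

    def test_garbage_input_raises():
        with pytest.raises(ValueError):
            parse_price("call for pricing")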
zachwills•1h ago
risyachka•1h ago
zachwills•1h ago
pavel_lishin•1h ago
I'd be very interested to hear less about the process, and more about the app itself.
If a week "in the hole" resulted in a Slackbot that greets people when they enter a Slack channel, I'd be significantly less impressed than if you'd built, say, a CI pipeline, or something that automatically creates Jira tickets based on outages, or automatically handles subscription renewals or something.
Plus, the number of commits & PRs is absolutely not a useful metric. How well is the app running? How many bugs are you finding day to day? How much functionality is missing, how easy is it to add new functionality based on user feedback, etc? Monitoring?
zachwills•1h ago
pavel_lishin•50m ago
Again, like my other comment, I'm not being snarky - but it feels weird and inauthentic to be thanked for contributing to the discussion, and it sounds like you ran my comment through ChatGPT as well and asked it to post a response.
But the end result does sound like more than just a trivial app, so I am genuinely impressed it's working well. (I'd love to see how long it would take another team of engineers, or another single engineer I guess, to re-create it with the same functionality.)
zachwills•4m ago
esafak•1h ago
zachwills•1h ago