5 years ago: ML-auto-complete → You had to learn coding in depth
Last Year: AI-generated suggestions → You had to be an expert to ask the right questions
Now: AI-generated code → You should learn how to be a PM
Future: AI-generated companies → You must learn how to be a CEO
Meta-future: AI-generated conglomerates → ?
Recently I realized that instead of just learning technical skills, I need to learn management skills. Specifically: project management, time management, writing specifications, setting expectations, writing tests, and in general handling and orchestrating an entire workflow. And I think this will only shift to higher levels of the management hierarchy in the future. For example, in the future we will have AI models that can one-shot an entire platform like Twitter. Then the question is less about how to handle a database and more about how to handle several AI-generated companies!
While we're at the project manager level now, in the future we'll be at the CEO level. It's an interesting thing to think about.
One-shot means you provide one full question/answer example (from the same distribution) in the context to the LLM.
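In prompt terms, that looks something like the sketch below (a minimal illustration using the common chat-messages convention; the example strings are made up and no specific vendor SDK is assumed):

    # One-shot prompting: include exactly one worked question/answer pair
    # (from the same task distribution) before the real question.
    example_question = "Summarize: 'The cat sat on the mat.'"
    example_answer = "A cat sat on a mat."
    real_question = "Summarize: 'The quick brown fox jumps over the lazy dog.'"

    messages = [
        {"role": "system", "content": "You are a concise summarizer."},
        # The single in-context example is what makes this "one-shot".
        {"role": "user", "content": example_question},
        {"role": "assistant", "content": example_answer},
        # The actual task the model should answer.
        {"role": "user", "content": real_question},
    ]

    # `messages` would then be passed to whatever chat-completion API you use.
    print(messages)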
The cost of a model capable of running an entire company will be multiples of the market cap of the company it is capable of running.
Also you're forgetting the decreasing cost of AI, as well as the fact that you can buy a $10k Mac Studio NOW and have it run 24/7 with some of the best models out there. The only costs would be the initial fixed cost and electricity (250W at peak GPU usage).
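Back-of-the-envelope for the electricity part, assuming a continuous 250W draw and a made-up $0.15/kWh rate (adjust for your local tariff):

    # Rough running cost of a 250W machine left on 24/7.
    # The electricity rate below is an assumption, not a quoted figure.
    watts = 250
    rate_per_kwh = 0.15  # USD, assumed

    kwh_per_day = watts / 1000 * 24            # 6.0 kWh
    cost_per_day = kwh_per_day * rate_per_kwh  # ~$0.90
    cost_per_year = cost_per_day * 365         # ~$330

    print(f"{kwh_per_day:.1f} kWh/day, ${cost_per_day:.2f}/day, ${cost_per_year:.0f}/year")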
AI is still being heavily subsidized. None of the major players have turned a profit, and they are all having to do 4D Chess levels of financing to afford the capex.
I bring the table, AI brings the value.
Basically I don't see how you can be an AI maximalist and a capitalist at the same time. They're contradictory, IMO.
If you become just a manager, you don't have answers to these questions. You can just ask the AI agent for the answer, but at that point, what value are you actually providing to the whole process?
And what happens when, inevitably, the agent responds to your question with "You're absolutely right, I didn't consider that possibility! Let's redo the entire project to account for this?" How do you communicate that to your peers or clients?
This is the kind of half baked thought that seems profound to a certain kind of tech-brained poster on HN, but upon further consideration makes absolutely zero sense.
@dang
If anything, managing the project, writing the spec, setting expectations and writing tests are things LLMs are incredibly well suited for. Getting their work 'correct' and not 'functional enough that you don't know the difference' is where they struggle.
I like his thinking but many professional managers are not good at management. So I'm not sure about the assumption that "many people" can easily pick this up.
"AI labs"
Can we stop this misleading language. They're doing product development. It's not a "laboratory" doing scientific research. There's no attempt at the scientific method. It's a software firm and these are software developers/project managers.
Which brings me to point 2. These guys are selling AI tooling. Obviously there's a huge desire to dogfood the tooling. Plus, by joining the company, you are buying into the hype and the vision. It would be more surprising if they weren't using their own tools the whole time. If you can't even sell to yourself...
I don't know why you're trying to suggest some kind of restriction on the word "lab", or based on what. But calling them "labs" is perfectly normal, conventional, and justified terminology.
We've been running agent workflows for a while now. The pattern that works: treat agents like junior team members. Clear scope, explicit success criteria, checkpoints to review output. The skills that matter are the same ones that make someone a good manager of people.
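Concretely, the delegation brief ends up looking something like this (a minimal sketch; the AgentTask structure is hypothetical, not any particular framework's API):

    from dataclasses import dataclass, field

    @dataclass
    class AgentTask:
        """The same brief you'd hand a junior engineer."""
        scope: str                   # what is in (and out of) bounds
        success_criteria: list[str]  # explicit, checkable outcomes
        checkpoints: list[str] = field(default_factory=list)  # where a human reviews

    task = AgentTask(
        scope="Add input validation to the signup endpoint; do not touch auth.",
        success_criteria=[
            "All existing tests still pass",
            "New tests cover empty, malformed, and oversized payloads",
        ],
        checkpoints=["after the test plan is drafted", "before merge"],
    )
    print(task)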
pglevy is right that many managers aren't good at this. But that's always been true. The difference now is that the feedback loop is faster. Bad delegation to an agent fails in minutes, not weeks. You learn quickly whether your instructions were clear.
The uncomfortable part: if your value was being the person who could grind through tedious work, that's no longer a moat. Orchestration and judgment are what's left.
Patently shocked to find this on profile:
> I lead AI & Engineering at Boon AI (Startup building AI for Construction).
I disagree. Part of being a good manager of (junior) people is teaching them soft skills in addition to technical skills -- how to ask for help and do their own research, and how to build their own skills autonomously, how to think about requirements creatively, etc.
Clear specifications and validating output is only a part of good people management, but is 100% of good agent management.
Actually you can. Training data, then the way you describe the task, goals, checkpoints, etc.
Yes, it's a crutch. But maybe the whole thing of NNs that can code without us really knowing why is one too.
Here's the thing - that feedback loop isn't a magic lamp. Actually understanding why an agent is failing (when it does) takes knowledge of the problem space. Actually guiding that feedback loop so it optimally handles tasks - segmenting work and composing agentic cores to focus on the right things with the right priority of decision making - that's something you need to be curious about the internals for. Engineering, basically.
One thing I've seen in using these models to create code is that they're myopic and shortsighted - they do whatever it takes to fix the problem right in front of them when asked. This causes a cascading failure mode where the code is a patchwork of one-off fixes and hardcoded solutions for problems that not only recur but get exponentially worse as they compound. You'd only know this if you could spot it when the model says something like "I see the problem, this server configuration is blocking port 80 and that's blocking my test probes. Let me open that port in the firewall".
Actually I disagree. I've been experimenting with AI a lot, and the limiting factor is marketing. You can build things as fast as you want, but without a reliable and repeatable (and at least somewhat automated) marketing system, you won't get far. This is especially because all marketing channels are flooded with user-generated content (UGC) that is generated by AI.
But you can also think about what you would want to build (for yourself or someone you know) that would otherwise take a team of people. Coding what used to be a professional app can now be a short hobby project.
I played with Claude Code Pro for only a short while, but I already believe the mode of software production will change to be more accessible to individuals (pro or amateur). It will be similar to the death of music labels.
Where are y'all working that "writing code" was ever the slow part of the process?
What kind of work do you think people who deal with LLMs every day are doing? LLMs could maybe take something 60% of the way there. The remaining 40% is horrible tedious work that someone needs to grind through.
You still need to do most of the grunt work, verifying and organizing the code; it's just that you're not editing the code directly. The speed of typing out code is hardly the bottleneck.
The bottleneck is visualizing it and then coming up with a way to figure out bugs or add features.
I've tried a bunch of agents, none of them can reasonably conduct a good architectural change in a medium size codebase.
I feel like this was always true. Business still moves at the speed of high-level decisions.
> The uncomfortable part: if your value was being the person who could grind through tedious work, that's no longer a moat.
Even when junior devs were copy-pasting from stackoverflow over a decade ago they still had to be accountable for what they did. AI is ultimately a search tool, not a solution builder. We will continue to need junior devs. All devs regardless of experience level still have to push back when requirements are missing or poorly defined. How is picking up this slack and needing to constantly follow up and hold people's hands not "grinding through tedious work"?
AI didn't change anything other than how you find code. I guess it's nice that less technical people can now find it using their plain-English ramblings instead of needing to know better keywords? AI has arguably made these search results worse and the need for good docs and examples even more important, and we've all seen how vibecoding goes off the rails.
The best code is still the least you can get away with. The skill devs get paid for has always been making the best choices for the use case, and that's way harder than just "writing code".
Similarly, it’s easy to think that the lowly peons in the engineering world are going to get replaced and we’ll all be doing the job of directors and CEOs in the future, but that doesn’t really make sense to me.
Being able to whip your army of AI employees 3% better than your competitor doesn’t (usually) give any lasting advantage.
What does give an advantage is: specialized deep knowledge, building relationships and trust with users and customers, and having a good sense of design/ux/etc.
Like maybe that’s some of the job of a manager/director/CEO, but not anyone that I’ve worked with.
At least ChatGPT, Gemini and Claude told me it was. I did so many rounds of each one evaluating the other, trying to poke holes etc. Reviewing the idea and the "research", the reasoning. Plugging the gaps.
Then I started talking to real people about their problems in this space to see if this was one of them. Nope, not really. It kinda was, but not often enough to pay for a dedicated service, and not enough of a pain to move on from free workarounds.
Beware of AI reviewing AI. Always talk to real people to validate.
I once solved a Leetcode problem in a kind of unorthodox way, and both ChatGPT and Gemini said it was wrong in the same way. Then I asked both of them to give me a counterexample, and only Gemini was able to realize the counterexample would have actually worked.
Aurornis•2h ago
In my experience so far, AI prototyping has been a powerful force for breaking analysis paralysis.
In the last 10 years of my career, the slow execution speed at different companies wasn't due to slow code writing. It was due to management excesses trying to drive consensus and de-risk ideas before the developers were even allowed to write the code. Let's circle back and drive consensus in a weekly meeting with the stakeholders to get alignment on the KPIs for the design doc that goes through the approval and sign off process first.
Developers would then read the ream and realize that perfection was expected from their output, too, so development processes grew to be long and careful to avoid accidents. I landed on a couple teams where even small changes required meetings to discuss it, multiple rounds of review, and a lot of grandstanding before we were allowed to proceed.
Then AI comes along and makes it cheap to prototype something. If it breaks or it's the wrong thing, nobody feels like they're in trouble because we all agree it was a prototype and the AI wrote it. We can cycle through prototypes faster because it's happening outside of this messy human reputation-review-grandstanding loop that has become the norm.
Instead of months of meetings, we can have an LLM generate a UI and a backend with fake data and say "This is what I want to build, and this is what it will do". It's a hundred times more efficient than trying to describe it to a dozen people in 1-hour timeslots in between all of their other meetings for 12 weeks in a row.
The dark side of this same coin is when teams try to rely on the AI to write the real code, too, and then blame the AI when something goes wrong. You have to draw a very clear line between AI-driven prototyping and developer-driven code that developers must own. I think this article misses the mark on that by framing everything as a decision to DIY or delegate to AI. The real AI-assisted successes I see have developers driving with AI as an assistant on the side, not the other way around. I could see how an MBA class could come to believe that AI is going to do the jobs instead of developers, though, as it's easy to look at these rapid LLM prototypes and think that production ready code is just a few prompts away.
Exoristos•2h ago
So is an 8-ball.
alexhans•1h ago
Shipping prototypes doesn't actually create value unless they're in some form of production environment to effect change, so either they work and are ZeroOps, or they break and someone needs to operate them and be accountable for them.
This means that at some point, your thesis of
"The dark side of this same coin is when teams try to rely on the AI to write the real code, too, and then blame the AI when something goes wrong" won't really work that way; whoever is accountable will get the blame and the operations.
The same principles for building software that we've always had apply more than ever to AI-related things.
Easy to change, reusable, composable, testable.
Prototypes need to be thrown away. Otherwise they're tracer bullets, and you don't want tech debt in your tracer bullets unless your approach is to throw it to someone else and make it their problem.
-----
Creating a startup or any code from scratch in a way where you don't actually have to maintain it and find out the consequences of your lack of sustainable approaches (tech debt/bad design/excessive cost) is easy. You hide the hardest part. It's easy to do things that on the surface look good if you can't see how they will break.
The blog post is interesting but, unless I've missed something, it does gloss over the accountability aspect. If you can delegate accountability you don't worry about evals-first design, you can push harder on dates because you're not working backwards from the actual building and design and its blockers.
Evals (think promptfoo) for evals-first design will be key for any builder who is accountable for the decisions of their agents (automation).
I need to turn it into a small blog post, but the points of the talk (https://alexhans.github.io/talks/airflow-summit/toward-a-sha...):
- We can’t compare what we can’t measure
- Can I trust this to run on its own?
are crucial for a live system that makes critical decisions. If you don't have this, you're just using the --yolo flag.
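A minimal sketch of what evals-first means in practice (a hypothetical harness, not promptfoo's actual config format): define the checks before wiring the agent into anything live, and track the pass rate across changes.

    def agent(prompt: str) -> str:
        # Stand-in for whatever model/agent call you are evaluating.
        return "42" if "meaning of life" in prompt.lower() else "I don't know"

    eval_cases = [
        {"prompt": "What is the meaning of life?", "must_contain": "42"},
        {"prompt": "What is the capital of France?", "must_contain": "Paris"},
    ]

    results = [case["must_contain"] in agent(case["prompt"]) for case in eval_cases]
    pass_rate = sum(results) / len(results)
    # Compare this number across prompt/model changes before granting more autonomy.
    print(f"pass rate: {pass_rate:.0%}")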
chunky1994•1h ago
This is what's missing in most teams. There's a bright line between throwaway, almost fully vibe-coded, cursorily architected features on a product and designing and building a scalable, production-ready product. I don't need a mental model of how to build a prototype; I absolutely need one for something I'm putting in production that is expected to scale, and where failures are acceptable but failure modes need to be known.
Almost everyone misses this in going the whole AI hog, or in going the no-AI hog.
Once I build a good mental model of how my service should work and design it properly, all the scaffolding is much easier to outsource, and that's a speedup, but I still own the code because I know what everything does and my changes to the product are well thought out. For throwaway prototypes it's 5x this output, because the hard part of actually thinking the problem through doesn't really matter; it's just about getting everyone to agree on one direction of output.
ryandrake•1h ago