I'm a product manager and I was talking to my dev lead yesterday about this very thing. If PMs are like headlights on a car and devs are like the engine, then we're going from cars that max at 80mph to cars that push past 600mph, and we're headed toward much faster than that. The headlights need to extend much further into the future to be able to keep the car from repeatedly running into things.
That's the challenge we face. To paraphrase Ian Malcolm, we need to think beyond what we can build to consider more deeply what we should build.
I have yet to see evidence that this is really the case. Already 15 years ago, people were creating impressive software over the course of a hack day by gluing open source repos together in a high-level language. Now that process has been sped up even more, but does it matter that much if the prototype takes 4 or 24 hours to make? The real value is in well-thought-out, highly polished apps, and AFAICT those still take person-years to complete.
An example: with digital cameras I tend to take a lot of photos and cull afterwards, whereas with film I used to take far fewer. So now there's a whole culling phase that needs different tools and different skills, and you risk putting less effort into each photo in the first place and ending up with far more photos, none of them good (the so-called "slop"). Still, it is possible to take more and better pictures with digital cameras than before.
To your point: I think we're a fair bit away from Roomba-style (run in a single direction until you bump into something, then back up, turn, and repeat) development, but it doesn't seem impossible.
Development is nothing like driving a car, and it makes no sense to liken it to one. There's no set route, no road to follow, and no single person driving.
Quick math on the environmental impact of this assuming 18.35Wh/1000 tokens:
Total energy: 4.73GWh, equivalent of powering 450 average US homes annually
Carbon footprint: ~1822 metric tons of CO2, equivalent of driving 4.56 million miles in a gas powered car
Water consumption: 4.5 million litres, recommended daily water intake for 4000 people for a full year
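For anyone who wants to check the arithmetic, here's a quick sketch of the conversions. The per-home, grid-intensity, per-mile, and daily-intake factors are my own assumed round numbers, back-solved to roughly match the figures above:

```python
# Assumed conversion factors (not from the original comment):
# 10,500 kWh/yr per average US home, 385 g CO2/kWh grid intensity,
# 400 g CO2/mile for a gas car, 3 L/day recommended water intake.
WH_PER_1K_TOKENS = 18.35

tokens = 4.73e9 / WH_PER_1K_TOKENS * 1000   # ~258 billion tokens
energy_kwh = 4.73e6                          # 4.73 GWh expressed in kWh

homes = energy_kwh / 10_500                  # ~450 average US homes for a year
co2_tonnes = energy_kwh * 385 / 1e6          # ~1,820 metric tons of CO2
miles = co2_tonnes * 1e6 / 400               # ~4.55 million gas-car miles
people = 4.5e6 / (3 * 365)                   # ~4,100 people's yearly intake
```

With those factors the stated numbers all line up to within rounding.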
Yet they're on twitter bragging...
EDIT: Typos
I look around and everything seems to be... the same? Apart from the availability of these AI tools, what has meaningfully changed since 2020?
Hell, the GP spent more than $50,000 this year on API calls alone and the results are... what again? Where is the innovation? Where are the tools that wouldn't have been possible to build pre-ChatGPT?
I'm constantly reminded of the Feynman quote: "The first principle is that you must not fool yourself, and you are the easiest person to fool."
https://www.tomshardware.com/tech-industry/artificial-intell...
https://app.powerbi.com/view?r=eyJrIjoiZjVmOTI0MmMtY2U2Mi00Z...
Adjusted beef consumption: 4.5 million litres of water can be used to produce 300kg of beef -> the US (the highest per-capita beef consumer) consumes 23.3kg of beef per person, so that's enough to feed ~13 Americans (~30 Brits, ~43 Japanese) yummy, delicious grass-fed beef for a full year!
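A sketch of that arithmetic; note the ~15,000 litres-per-kg water footprint is implied by the comment's own figures rather than taken from an external source:

```python
water_litres = 4.5e6
beef_kg = 300                                # stated yield for that water
litres_per_kg = water_litres / beef_kg       # 15,000 L per kg of beef

us_per_capita_kg = 23.3                      # stated US consumption
americans_fed = beef_kg / us_per_capita_kg   # ~13 Americans for a year
```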
Neither the cow nor the cow's food retains much water; the water is merely delayed a little in its journey to the local watershed, and in vast parts of the US, local rainfall is adequate for this purpose (power irrigation isn't required for the crops, and cattle may drink from a pond.) Even if a cow drinks pumped well water, the majority of its nourishment will have been itself sustained by local natural rainfall.
A datacenter's use of water over any timescale can hardly be compared with a cow's.
Sounds like that's more than a junior or an intern, who would have cost twice as much in fully loaded cost.
It's also more than hiring someone overseas, especially just for a few months. Honestly, it's more than most interns are paid for 3 months outside FAANG (considering housing is paid for there, etc.)
2. Yeah I mentioned that also.
3. It's still more expensive than hiring a contractor, especially abroad, even all in.
You're not addressing the idea in the post, but the person. What the author achieved personally is irrelevant. What's important is that a very large and important industry is being completely transformed.
Add: if I had done it without LLMs it would have taken me a year, and it would have been less complete.
Is a really weird way to say that you built a native compiler for TypeScript.
Well, the TypeScript team is rewriting their compiler in Go (landing in v7), and some people call it the native compiler. I think my statement is clearer? But then English is not my first language, so there's that.
I do want better workflows where the AI's thinking, the transcript, is captured. Going back to understand what just happened is the major delay, and that cost increases day by day and week by week, especially if the session where the generation was done is lost.
This is my experience now too. The degree to which we are bottlenecks comes down to how good we are at finding the right balance between micromanaging the models (doesn't work well; it's a massive waste of time, and most of the issues you spend time correcting are things the models can correct themselves) vs. abandoning all oversight (also doesn't work well; it will entrench major architectural problems that take lots of effort to fix).
I spend a fairly significant amount of time revising agents, skills, etc. to take myself out of the loop as much as possible, by reviewing what has worked and what hasn't, and letting the model fix what it can before I have to review its code. My experience is that this time has a high ROI.
It doesn't matter if the steps I add waste lots of the model's time cleaning up code I ultimately end up rejecting, because its time is cheap and mine is not, and the cleanups also tend to shorten the time it takes for it to realise it's done something stupid.
Getting to a point where I'm comfortable "letting go" (letting the model write stupid code and fix it itself before I even look at it) has been the hardest part of accelerating my AI use.
If I keep reading as Claude Code runs, the model often infuriates me, and I end up starting to type a message telling it to fix something tremendously idiotic it has just done, only to have it realise and fix it before I get to pressing enter. There's no point doing that, so increasingly I put my sessions on other virtual desktops and try to forget about them while they're working.
It still does stupid stuff, but the proportion of stupid stuff I need to manually review and reject keeps dropping.
Modern LLMs are amazing for writing small, self-contained tools/apps and adding isolated features to larger code bases, especially when the problem can be solved by composing existing open source libraries.
Where they fall flat is their lack of long term memory and inability to learn from mistakes and gain new insider knowledge/experience over time.
The other area where they fall flat is that they rush to achieve their immediate goal and tick functional boxes without considering wider issues such as security, performance, and maintainability. I suspect this is an artefact of the reinforcement learning process: it's relatively easy to assess whether a functional outcome has been achieved, while assessing secondary outcomes (is this code secure, bug-free, maintainable, and performant?) is much harder.
As a developer you would take that and break it down to a design and smaller tasks that can show incremental progress and give yourself a chance to build feature Foo, assess the situation and refactor or move forward with feature Bar.
Working with an LLM to build a full-featured application is no different. You need to design the system and break down the work into smaller tasks for it to consume and build. Both it and you can verify the completed work and keep track of things to improve, and not repeat, as it moves forward with new tasks.
Keeping full guard rails in place, like linters, static analysis, and code coverage, further helps ensure what is produced is better quality code. At some point, are you babysitting the LLM so much that you could write it by hand? Maybe, but I generally think not. While I can get deeply focused and write lots of code, LLMs can still generate code and accompanying documentation, fix static analysis issues, and write/run the unit tests without taking breaks or getting distracted. And for some series of tasks, it can do them in parallel in separate worktrees, further reducing the aggregate time to complete.
I don’t expect a developer to build something fully without working on it incrementally with feedback; it's not much different with an LLM if you want meaningful results.
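The parallel-worktree step can be sketched like this; everything here (the temp repo, branch names, where lint/tests would run) is illustrative, not the commenter's actual setup:

```python
import os
import subprocess
import tempfile

def git(*args, cwd):
    # Thin wrapper so any git failure raises instead of passing silently.
    subprocess.run(["git", *args], cwd=cwd, check=True, capture_output=True)

root = tempfile.mkdtemp()
repo = os.path.join(root, "repo")
os.makedirs(repo)
git("init", "-q", cwd=repo)
git("-c", "user.name=x", "-c", "user.email=x@x.invalid",
    "commit", "--allow-empty", "-m", "init", cwd=repo)

# One worktree per in-flight task, so parallel agent sessions don't collide.
# Guard rails (linters, static analysis, tests) would run per-tree.
for task in ("feature-foo", "feature-bar"):
    git("worktree", "add", "-q", "-b", task, os.path.join(root, task), cwd=repo)
```

Each worktree gets its own checked-out branch and working directory, so independent agent sessions can commit, lint, and test without stepping on each other.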
It's an assistant building itself live on Discord. It's really fun to watch.
The author loves vibe coding because... it lets them vibe code even more:
"One of my early intense projects was VibeTunnel. A terminal-multiplexer so you can code on-the-go. I poured pretty much all my time into this earlier this year, and after 2 months it was so good that I caught myself coding from my phone while out with friends… and decided that this is something I should stop, more for mental health than anything."
It's unclear whether the "all my time" here is "all my waking hours" or "all my time outside of my job, family duties, and other hobbies", but it's still a bit puzzling.
And so anyway, what is it that they want to code on the go so much?
"an AI assistant that has full access to everything on all my computers, messages, emails, home automation, cameras, lights, music, heck it can even control the temperature of my bed."
I guess everyone's free to get their kicks however they feel like, but - paying thousands of dollars in API fees to control your music and the temperature of your bed? Why is that so exciting?
[0] https://www.reddit.com/r/slatestarcodex/comments/9rvroo/most...
This guy is clearly an outlier and spending far more than most, but from personal experience you can extract an enormous amount of value from a $20/month ChatGPT subscription, especially when paired with Codex.
I am learning Greek at the moment; with Codex I was able to produce a microsite and a prompt that generates a lesson per day. It's pretty cool, as it does both TTS and speech-to-text, so I can actually practice generated conversations with myself.
Calorie Tracking:
Now I just send a picture or text into a Telegram channel; an agent picks it up, classifies it as "food info", and sends it to another agent to calculate calories, either as a best-effort guess or, if I've sent nutritional information in the pic, by reading it and following up to ask for portion size.
Workout Tracking:
Same Telegram channel: again, just free text of what I've lifted / exercises I've done, and it all gets stored. There's then an agent that uses this, plus the calories submitted, to see if I am on track to reach my goals, or offers tweaks as I go.
Reminders:
Same Telegram channel (there's a theme here): send reminders in, they're stored, and a scheduler runs that sends me a push notification when the event is due. It's simple, but just way better than Google's offering.
Then there's some other personal assistant stuff. For example, I get a lot of emails from my kids' school that contain important dates and requests for money; before, this was a PITA to extract, but now I just have an agent that reads the emails, extracts the documents, scans them for relevant information, and adds it to my calendar, or sends me payment reminder requests until I've paid them.
I'm pretty early days but I can just see this list expanding.
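A minimal sketch of the classify-then-route pattern described above. `classify()` stands in for the LLM classifier agent, and all names, keywords, and handler behaviors here are invented for illustration:

```python
def classify(message: str) -> str:
    # Stand-in for an LLM classifier agent; returns a routing tag.
    text = message.lower()
    if any(w in text for w in ("kcal", "ate ", "meal", "food")):
        return "food_info"
    if any(w in text for w in ("lifted", "squat", "workout", "ran ")):
        return "workout"
    if "remind" in text:
        return "reminder"
    return "unknown"

# Each tag maps to a downstream agent; real handlers would call an LLM
# and persist to storage, these just return strings.
HANDLERS = {
    "food_info": lambda m: f"calorie agent: estimating calories for {m!r}",
    "workout":   lambda m: f"workout agent: logging {m!r}",
    "reminder":  lambda m: f"scheduler: will push a notification for {m!r}",
}

def route(message: str) -> str:
    # Classify, then hand off to the matching agent (or ask to clarify).
    tag = classify(message)
    return HANDLERS.get(tag, lambda m: "follow-up: please clarify")(message)
```

The appeal of the setup is that one inbox (the Telegram channel) fans out to many single-purpose agents, so adding a new workflow is just a new tag and handler.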
For the Duolingo replacement, you still need to pay for the tokens the app consumes when it generates your daily lesson, yes? I'm sure it's still less than paying for Duolingo; just wanted to confirm.
Calorie tracking is nice also! Combined with workout tracking it's pretty good! I get workout tracking free with Garmin + Strava.
I like the email additions too! I think Gmail does something similar but this feels like on steroids. Wow all this feels like I'm in school again learning coding for the first time :)
The daily lessons are vibe coded as part of the Codex/ChatGPT $20-a-month subscription cost.
For example, instead of:

Duolingo - I practice with my friends

Calorie tracking - I have planned meals from my dietitian

Workout tracking - I have WhatsApp with my PT, who adjusts my next workouts from our conversations

Reminders - A combo of Siri + Fantastical + my wife
I'm sure my way is more expensive, but I don't know; there is also the intangible cost of not having friends/personal connections.
I wasn’t swapping human connection for LLMs. These workflows already existed; I’ve simply used newer tools to make them better aligned to my needs and more cost-effective for me.
Working with startups, I meet a LOT of people who obsessively cannot stop using LLMs. People who jump on MAX plans to produce as much as possible- and in the startup scene it's often for the worst ideas.
LLMs are slot machines- it's fun to pull the lever and see what you get. But the hard problem of knowing what is actually needed gets harder as we sift through ten-thousand almost-useful outputs.
If you study computer tech history, this has happened a few times: with PHP (omg! You mean I can use the same language as the back end right in the web page?), Visual Basic (omgf! You mean I can just "draw" a program for a computer?), node.js (OMFG!! You mean I can be considered a "full stack developer" after a 1 hour lesson?), voice assistants with ML (OMFGcopter!!Q1! You mean I can ask my house what the weather outside my house is and it knows? And it knows how many tablespoons are in a cup?), and now we are at gen AI (I'm sorry Dave, I can't generate a website for you that rug-pulls MethCoin©, however I can create the electric-powered umbrella 3D models to print, along with electronics diagrams, PCB layouts, and a mobile app for monthly subscriptions. Do you like purple? I love purple, Dave. You have great ideas, Dave)