frontpage.

Some uncomfortable truths about AI coding agents

https://standupforme.app/blog/some-uncomfortable-truths-about-ai-coding-agents/
54•borealis-dev•3h ago

Comments

palmotea•1h ago
> The role change has been described by some as becoming a sort of software engineering manager, where one writes little or no code oneself but instead supervises a team of AI coding agents as if they are a team of human junior software engineers....

> In reality, though, the code review load for software engineers will gradually increase as fewer and fewer of them are expected to supervise an ever-growing number of coding agents, and they will inevitably learn to become complacent over time, out of pure necessity for their sanity. I’m a proponent of code review...but even I often consider it a slog to do my due diligence for a large code review (just because I think it’s important doesn’t mean I think it’s fun). If it’s your full-time job to review a swarm of agents’ work, and experience tells you they are good enough 95%+ of the time, you’re not going to pay as much attention as you should and bad changes will get through.

Another way to look at this is that AI coding agents take the fun out of a software engineer's job. The machine takes many of the fun parts and leaves the human with more of the unenjoyable parts.

Under our new ways of working, you are required to be excited and curious about this evolution three times per day.

jonah•58m ago
Sounds a lot like "self-driving" cars - "they are good enough 95%+ of the time, you’re not going to pay as much attention as you should".

Same thing happens here, you get complacent and miss critical failures or problems.

It's also similar in that it "take[s away] many of the fun parts". When I can focus on simply driving, it can be engaging and enjoyable, no matter the road or traffic or whatever.

mark242•54m ago
> Another way to look at this is that AI coding agents take the fun out of a software engineer's job.

Completely backwards - the fun in the job should be to solve problems and come up with solutions. The fun in the job is not knowing where to place a semicolon.

bluefirebrand•49m ago
> the fun in the job should be to solve problems and come up with solutions

Who are you to tell anyone what the fun "should" be?

Personally, I find writing code very fun, because building the solution is also very gratifying.

Besides which, in my experience, until you actually write the code you haven't proven that you solved anything. It's so easy to think you have solved a problem when you haven't, but you won't figure that out until you actually try to apply your solution.

> The fun in the job is not knowing where to place a semicolon.

This can be solved with simple linters; no need for LLMs.

chris_money202•48m ago
Exactly, the fun part is when the code works and does what you wanted it to do. Writing code itself is not fun. People forget this because they get small wins / dopamine hits along the way (a clever function, an elegant few lines of code, a bug fix), but the majority of the time coding is just a grind until the end, where you get the big dopamine hit.
palmotea•47m ago
>> Another way to look at this is that AI coding agents take the fun out of a software engineer's job.

> Completely backwards - the fun in the job should be to solve problems and come up with solutions.

Aren't the coding agents supposed to be doing that too? You give them the problem, they code up a solution, and then the engineer is left to review it to see if it's good enough.

> The fun in the job is not knowing where to place a semicolon.

That's like such a minor and easy-to-do thing that I'm surprised you're even bringing it up.

JohnMakin•30m ago
Eh, that’s not at all how I do it. I like to design the architecture and spec and let them implement the code. That is a fun skill to exercise. Sometimes I give a little more leeway in letting them decide how to implement, but that can go off the rails.

imho “tell them what you want and let them come up with a solution” is a really naive way to use these tools nearly guaranteed to end up with slopware.

the more up-front design I’ve given thought to, the more accurate they usually are in delivering, to the point I don’t need to spend very much time reviewing at all. and this is a step I would have had to do anyway if doing it by hand, so it feels natural, results in far more correct code more often than I could have managed on my own, and allows multitasking several projects at once, which would have been impossible before.

autoexec•47m ago
Companies aren't investing in AI because they want to solve the problem of semicolon placement. They want AI to solve problems and come up with solutions. Then they want to fire most of their programmers and force the rest to do nothing but check over and fix the slop their marketing departments are churning out.
plagiarist•40m ago
I don't know why they'd stop at most programmers instead of all programmers. And the marketing department will also be AI. Companies want AI to remove the need for any labor so they can more directly gain money based on already having money.
Avicebron•31m ago
> directly gain money based on already having money.

I'm stealing this.

autoexec•5m ago
They'll need at least a few programmers because AI doesn't actually work very well and fixes will be required. The marketing department may end up replaced by AI but so far marketers have convinced companies that they're so essential that even the most popular and well known brands in the world feel the need to spend billions on more and more marketing. If anyone can talk their way into staying employed it'll be marketers.
lelanthran•44m ago
> Completely backwards - the fun in the job should be to solve problems and come up with solutions.

You don't need to be a software engineer to do that.

mark242•43m ago
Except you kind of do -- understanding data structures, understanding software engineering concepts, all of the things that you learn as a good engineer, those are ways that you help guide the LLM in its work.
lelanthran•30m ago
> Except you kind of do -- understanding data structures, understanding software engineering concepts, all of the things that you learn as a good engineer,

How do you learn that without programming?

irishcoffee•29m ago
I don't think kids are learning those things in 2026, they just ask an LLM.

Someone posted on here the other day about how they were taking a non-credit writing class in college so as to improve their writing; that was the reason the course existed. 90% of the class was kicked out because they were using LLMs to write for them, when the entire purpose of the class was to improve one's own writing.

Why do you think it will be any different with programming?

dijksterhuis•29m ago
> the fun in the job should be to

man... can we not just accept that individuals have their own motivations and maybe my reasons for wanting to do the job aren't the same as yours?

oidar•25m ago
> The fun in the job is not knowing where to place a semicolon.

If a person needs an LLM to figure out where a semicolon goes, an LLM is not going to help them code.

kevinob11•18m ago
I don't need one to know where it goes, but it certainly is better than I am at never missing one.
theshackleford•19m ago
> the fun in the job should be

I think i'm going to let people decide for themselves what they enjoy in their job rather than pretending I know better than they do what they should and should not enjoy.

gonzalohm•13m ago
The fun of the job is building a piece of software that's beautifully written. It's a way of expressing yourself
dybber•19m ago
I think it depends on what you find enjoyable. People who like the tinkering and the actual act of coding, debugging, etc. will find it less and less fun to be in this area, but people who like to look at the big picture and solve problems will find that they are now better at getting an overview of larger and larger codebases, and that technical debt that was never tractable before can now be “outsourced” to LLMs.

I find that fun. I work in a 50 year old IT company, with lots of legacy code and technical debt which we have never been able to address - suddenly it’s within reach to really get us to a better place.

raw_anon_1111•19m ago
The “fun” for me has never been “coding” and on the enterprise dev side that has been a commodity for a decade.

If you look at the leveling guidelines for any tech company “codez real gud” will only get you to a mid level ticket taker.

adshotco•1h ago
The prompt injection section is the strongest point here and honestly underappreciated in most AI discourse. I work on a product that processes untrusted user-supplied content through an LLM pipeline, and the defensive engineering required is nontrivial. You essentially need a sanitization layer that strips anything resembling instructions from data before it enters the context window — conceptually similar to parameterized queries for SQL injection, except we don't have a clean equivalent yet. Every mitigation is heuristic-based and feels brittle.
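
A sanitization layer like the one described above can only be sketched heuristically; as the commenter notes, there is no clean parameterized-query equivalent. The following is a minimal illustrative sketch, where the function name and every pattern are assumptions for demonstration, not a vetted ruleset:

```python
import re

# Illustrative, heuristic patterns that often signal embedded instructions.
# This list is an assumption for demonstration; real deployments need far
# broader (and still imperfect) coverage.
INSTRUCTION_PATTERNS = [
    r"(?i)ignore\s+(?:all\s+|any\s+|previous\s+|prior\s+)+instructions",
    r"(?i)you\s+are\s+now\s+",
    r"(?i)system\s+prompt",
    r"(?i)disregard\s+the\s+above",
]

def sanitize_untrusted(text: str) -> str:
    """Strip instruction-like phrases from untrusted content before it
    enters the context window. Purely heuristic: it reduces, but cannot
    eliminate, prompt-injection risk."""
    for pattern in INSTRUCTION_PATTERNS:
        text = re.sub(pattern, "[REDACTED]", text)
    return text
```

The brittleness is visible immediately: any phrasing outside the pattern list slips through, which is exactly why every mitigation in this space feels like a blocklist rather than a structural fix.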

The copyright angle is also genuinely interesting. Most real codebases will end up as a mix of human and AI-generated code, and the legal boundaries for that scenario are completely uncharted. The Berne Convention point is a good one — amending international IP frameworks moves at glacial speed, so companies are going to be operating in legal uncertainty for a long time regardless of what individual jurisdictions decide.

abletonlive•58m ago
These opinions about what is going on w/ LLM development always stop short at first order effects and fail to account for second/third order effects.

> Skill atrophy

If LLMs are so good that you no longer have use for the skill, why do we care about skill atrophy? That skill isn't that useful to most people. There are so many examples of this in human history where it was completely fine and we went on to do higher order things that were more useful.

> Even if they set out fully intending to provide the highest level of scrutiny to all generated code, they will gradually lose the ability to tell a good change from a bad one

If this (first order effect) is actually a problem, then it follows that we will naturally exercise our skill of detecting good changes from bad ones (second order effect) and the skill will not atrophy (third order effect). Seems like your "problem" is self-correcting?

> At its core, the only defense I’ve got for that response is… this time feels different? Not a particularly rigorous defense, I admit, but I did warn you that this was the squishiest of the issues at hand.

Well, if you knew this perhaps it was better just not to lead with it and spend so many paragraphs on it.

> Some might argue that, even if that time comes eventually, that’s no reason not to make use of the tools that are available right now. But it should come as no surprise that I disagree. Better not to become overly dependent on AI coding agents in the first place so you’ll be better situated to weather the storm (and maybe even thrive) when it comes.

Well, this argument didn't turn out to be any less squishy than the first one. It's a self-correcting "problem" but you disagree, and we should do X because you said so. What was the point of all of this then?

> Prompt Injection

I also think this will likely always be a problem, but you can say the same about pretty much ANY tool we use in software development. Your viewpoint would be similar to saying we should stop using libraries because there's always going to be a vulnerability when you distribute code: somewhere in the chain a bad actor can inject malicious code, even if the library was created by a trusted source in the industry. We have plenty of examples of this happening in real life. So far, still squishy.

> Copyright/licensing

> I’m not a lawyer! I’m a legal layperson offering my unqualified assessment of some tricky legal questions. Let’s get to it.

Sigh, this entire post is slop, isn't it? Bad look for whatever "standup for me" is.

edit: Standup for me is something that is made entirely irrelevant by agentic LLMs, no surprise. The irony is rich.

The author wants to be the gatekeeper of skill, quality, and how we develop while they hand feed us slop in the form of their blog posts.

palmotea•51m ago
> If LLMs are so good that you no longer have use for the skill, why do we care about skill atrophy? That skill isn't that useful to most people. There are so many examples of this in human history where it was completely fine and we went on to do higher order things that were more useful.

Because the LLMs actually aren't that good, so humans are expected to monitor them using the skills they no longer have the opportunity to develop and maintain.

The OP talked about that. Did you miss it?

> If this (first order effect) is actually a problem then it follows that we will naturally exercise our skill of detecting good change from bad ones (second order effect) and the skill will not atrophy? (third order effect).

You're ignoring the anti-human psychological factors: humans are bad at continuously monitoring for occasional errors. The tendency will be to adopt a complacent attitude, default allow. It's not a good environment for developing a skill, compared to actually actively using it.

abletonlive•45m ago
> Because the LLMs actually aren't that good, so humans are expected to monitor them using the skills they no longer have the opportunity to develop and maintain

If humans are expected to monitor them using the skill, then obviously they are still practicing the skill, and the skill is developed and maintained. Help me understand why it is so difficult for everybody with this opinion to take another step into their premise?

> humans are bad at continuously monitoring for occasional errors

Let's assume this is true for the sake of discussion: that's the job, pre-LLM or not. Air traffic control? Occasional errors. Software bugs? Occasional errors. Department of Homeland Security? Occasional threats.

If it's hard and required that we handle the issue, then it's a skill that people will naturally exercise and the skill therefore won't atrophy.

If your argument were true we'd have swarms of people doing accounting by hand instead of using accounting tools, out of worry that the accountants would let their ability to audit the tools' output atrophy.

That's not how it works in the real world and we have plenty of examples of it...

But sure, if your argument simply boils down to "this time...it's different" like the author is arguing, then let's leave it at that. There's no value in discussing it further just like there was no value in the original post. It was just mindless slop to promote "standup for me" which is also something that falls under the category of: "things that are no longer relevant because of llms"

polotics•54m ago
i was kinda hoping for TFA to finally produce some research outputs or even statistics, but sadly the `uncomfortable truths` are your usual vague talking points.
dijksterhuis•49m ago
As someone who worked on “prompt injection” before it was called “prompt injection” for an (unfinished) phd…

yeah there is only one surefire 100% fix for “prompt injection”: use deterministic solutions ie not machine learning.

ineedasername•47m ago
There were no uncomfortable truths there about coding agents, save one of the 4 points, which was that they sometimes get prompt-injected if you let them search for things online and don't pay attention to where they search and what code they write. That's not an uncomfortable truth in the normal sense of "I know you don't want to admit this but..." and more just the thing that, if you didn't know it already 8 months ago, you certainly should by now.

The other truths that were not about coding agents:

--Skill Atrophy. (Use it or lose it-- another thing we already know)

--The economics of serving code agents at scale. (Ungrounded in actual numbers, only OpenAI's miscellaneous statements and anecdotes. Actual cost of running code agents: last gen's mid-tier gaming GPUs will get you reasonably close to Claude Sonnet if you put just a little time into an agent harness, and it's getting cheaper and cheaper for better and better. So, at scale, with real sysadmins doing the hard engineering to eke out every last bit of performance, the infra needed for serving these isn't the cost center.)

--Copyright. (This rests on the same bad read of a court ruling half the press has been making for a few years now. TLDR: the Thaler v. Perlmutter case said nothing about output not being protected by copyright. It denied Thaler's attempt to register *the AI* as the owner of the copyright.)

freetime2•40m ago
The more immediate uncomfortable truth for me is that my company is requiring all developers to use LLMs, and laying off developers who won't make the switch. I'm not sure that "LLM-based AI coding agents have no place now, or ever, in generating production code for any software I build professionally" is a decision that most of us will have a choice in.
johnfn•34m ago
The section on "artificially low costs" does not make a lot of sense to me. If anything I feel like the costs are inflated for the frontier models, not "artificially low". Easy proof: GLM-5 costs about 1/10 as much as Opus. I'm not going to tell you it's as good as Opus 4.6 -- it's not -- but it performs comparably to where frontier models were 6 months ago. (It's on par with Sonnet 4.5 on leaderboards, though in practice it's probably closer to Sonnet 4.0.)

If I can switch to an open source model today, run it myself, spend 1/10 as much as Opus, and get to about where frontier models were 6 months ago, then fear-mongering about how we'll have to weather "orders-of-magnitude price hikes" and arguing that one shouldn't even bother to learn how to use AI at all seems disconnected from reality. Who cares about the "shady accounting" OpenAI is doing, or that AI labs are "wildly unprofitable"? I can run GLM 5 right now, forever, for cheap.

piker•15m ago
The post is factoring in training costs, not just inference.
johnfn•14m ago
But I don't need to pay training costs to use GLM-5?
piker•9m ago
Sure, but somebody needs to pay for GLM-6 unless you're happy to stop here.
fredolivier0•33m ago
i just read this before - why is it 3hrs ago?
instig007•31m ago
I find the other article that the author refers to in his text to be more thorough and revealing: https://www.wheresyoured.at/the-ai-industry-is-lying-to-you/
rbalicki•30m ago
The skill atrophy point strikes me as tenuous at best. Obviously, the plural of anecdote is not data, but I find myself able to work on projects of greater complexity than I would have been able to otherwise. 90% of my time is spent going back and forth on Markdown files, discussing the architecture, trade-offs, etc. I don't deny it's possible to use all this newfound power to ship sloppier code, but it's clearly possible to use it to ship better code too.
piker•17m ago
> I find myself able to work on projects of greater complexity than I would have been able to otherwise

Yes. Now turn off the LLM and make an improvement to that code.

adamors•12m ago
Exactly, this is like watching YouTubers code, i.e. backseat coding. It’s easy to follow along, but taking control midway is anything but easy, especially in a codebase that was written by an agent and that you don’t have any muscle memory in.
james-clef•11m ago
Totally agree with this point about taking on projects of greater complexity. I honestly feel the sloppier-code concern is going to die soon. People make mistakes too; I always see people holding the machine to a totally different standard.
tao_oat•28m ago
I didn't find this very convincing. Especially the argument around artificially low cost -- we know that training the next model is the biggest cost for these companies, and we've already seen inference costs fall drastically (https://epoch.ai/data-insights/llm-inference-price-trends).
bigyikes•22m ago
Yes, Dario has publicly stated that models are already profitable if you exclude R&D for the next model.

Even if that’s not true, given that hardware and software efficiency gains can be expected to continue, it’s likely that this is the most expensive the current level of intelligence will ever be.

The frontier models may increase in price, but only because they’re also more capable. If you hold intelligence constant, price should fall over time.

skybrian•25m ago
These "truths" are more like concerns.

Skills do atrophy if you don't practice them, but also, refreshing your memory about some technology you haven't used in a while, or even learning something new, is easier than ever. You can ask the AI questions and try things out yourself very easily. Maybe "just in time" learning isn't good enough, but that's more of a concern based on speculation than a truth.

AI is being subsidized, but also, inference costs are dropping due to algorithmic improvements. For example, TurboQuant [1] looks pretty promising and even if it doesn't pan out, there are plenty of other potential advances like that. Competition might result in AI inference being available at good prices even without subsidies. So, again, more of a concern than a "truth."
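
To make the quantization point concrete: the whole reason schemes like this cut inference cost is that a 4-bit code occupies a quarter of the memory of a 16-bit value. Here is a toy pure-Python 4-bit uniform quantizer, purely didactic and unrelated to TurboQuant's actual algorithm:

```python
def quantize_4bit(values):
    """Toy per-tensor 4-bit uniform quantization: map floats onto 16
    evenly spaced levels between the min and max of the input."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15 if hi > lo else 1.0  # 15 steps span the range
    codes = [round((v - lo) / scale) for v in values]  # integers in 0..15
    return codes, lo, scale

def dequantize_4bit(codes, lo, scale):
    """Recover approximate floats; error is bounded by half a step."""
    return [c * scale + lo for c in codes]
```

Storing each value as a code in 0..15 needs only 4 bits instead of 16 or 32, at the cost of a reconstruction error no larger than half the step size, which is the basic trade every KV-cache quantizer is tuning.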

Prompt injection: an unsolved mess, but perhaps curated, trusted datasets will be good enough for many projects, so you don't have to expose your agent to the open Internet? It's a similar problem to the supply-chain vulnerabilities that downloadable open source libraries have. A valid concern, but it seems like we'll improve security and muddle through?

Copyright: also an unsolved mess, but kind of similar. Search engines copy the web as part of how they work, but that didn't stop Google from becoming big tech. And sure, Napster was built on copying music and was shut down, but YouTube was also built on widespread copyright violation and it muddled through. It's unclear whether copyright law is load-bearing infrastructure for the software industry or for open source software.

[1] https://github.com/tonbistudio/turboquant-pytorch

jjcm•21m ago
I disagree with the author's interpretation of many of their points.

> Skill Atrophy

Very true, but in the same way moving from assembly -> scripting languages degraded programmers' ability to manage memory. AI is another tool (albeit one that's on a different level than any other transformation we've had); what matters is whether the programmer understands the intent of the code, not the 0's and 1's it turns into. That intent is still there, we're just writing it at a much higher level than before.

> Artificially low cost

This varies heavily from model to model. IMO OpenAI/Anthropic are likely the outliers here, and I do agree that it's unlikely they'll recoup the training costs for their specific models, but they also legitimized an industry, something that's hard to tangibly price. Many of the models out of China, however, will almost certainly recoup cost. Qwen 3 had a training price of around $1.6M USD. In the first quarter of this year OpenRouter processed around 5 trillion tokens from it, landing at around $2.5M in revenue (very rough numbers). Assuming a 30% margin, that's already over 40% of their training cost in revenue for a quarter from a single platform, and their usage in China is almost certainly higher.
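
The back-of-envelope arithmetic above, spelled out using the commenter's own rough figures (none of these are audited numbers, and the margin is explicitly an assumption):

```python
# All inputs are the commenter's estimates, quoted as-is.
training_cost = 1.6e6        # Qwen 3 training cost, USD
quarterly_revenue = 2.5e6    # revenue from ~5T OpenRouter tokens in one quarter, USD
margin = 0.30                # assumed gross margin on inference

quarterly_profit = quarterly_revenue * margin
fraction_of_training = quarterly_profit / training_cost
print(f"{fraction_of_training:.0%} of training cost per quarter")
# prints: 47% of training cost per quarter
```

On these (very rough) inputs, a single platform's quarter of usage covers nearly half the quoted training bill, which is the crux of the claim that cheap-to-train models can plausibly recoup their costs.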

The reality is training costs are getting cheaper. I agree that currently the top providers are heavily subsidizing costs, but that doesn't mean you can't drive revenue, they just choose not to as having the "best" model right now gives them clout.

jopsen•4m ago
Lots of claims on cost, very little data.

Make macOS consistently bad (unironically)

https://lr0.org/blog/p/macos/
150•speckx•2h ago•94 comments

Anatomy of the .claude/ folder

https://blog.dailydoseofds.com/p/anatomy-of-the-claude-folder
302•freedomben•6h ago•154 comments

If you don't opt out by Apr 24 GitHub will train on your private repos

45•vmg12•13m ago•10 comments

Velxio 2.0 – Emulate Arduino, ESP32, and Raspberry Pi 3 in the Browser

https://github.com/davidmonterocrespo24/velxio
11•dmcrespo•33m ago•0 comments

Telnyx package compromised on PyPI

https://telnyx.com/resources/telnyx-python-sdk-supply-chain-security-notice-march-2026
49•ramimac•12h ago•65 comments

Nashville library launches Memory Lab for digitizing home movies

https://www.axios.com/local/nashville/2026/03/16/nashville-library-digitize-home-movies
45•toomuchtodo•3d ago•11 comments

ISBN Visualization – Annas Archive

https://annas-archive.gd/isbn-visualization?
13•Cider9986•1h ago•2 comments

Installing a Let's Encrypt TLS certificate on a Brother printer with Certbot

https://owltec.ca/Other/Installing+a+Let%27s+Encrypt+TLS+certificate+on+a+Brother+printer+automat...
161•8organicbits•7h ago•44 comments

Explore the Hidden World of Sand

https://magnifiedsand.com/
137•RAAx707•4d ago•30 comments

Building FireStriker: Making Civic Tech Free

https://firestriker.org/blog/building-firestriker-why-im-making-civic-tech-free
60•noleary•1d ago•13 comments

Meow.camera

https://meow.camera/#4258783365322591678
123•surprisetalk•6h ago•27 comments

Embracing Bayesian methods in clinical trials

https://jamanetwork.com/journals/jama/fullarticle/2847011
50•nextos•3d ago•4 comments

Can It Resolve DOOM? Game Engine in 2k DNS Records

https://core-jmp.org/2026/03/can-it-resolve-doom-game-engine-in-2000-dns-records/
29•Einenlum•3d ago•2 comments

Desk for people who work at home with a cat

https://soranews24.com/2026/03/27/japan-now-has-a-special-desk-for-people-who-work-at-home-with-a...
274•zdw•5h ago•112 comments

Ask HN: Founders of estonian e-businesses – is it worth it?

56•udl•3d ago•29 comments

‘Energy independence feels practical’: Europeans building mini solar farms

https://www.euronews.com/2026/03/26/suddenly-energy-independence-feels-practical-europeans-are-bu...
160•vrganj•12h ago•142 comments

People inside Microsoft are fighting to drop mandatory Microsoft Account

https://www.windowscentral.com/microsoft/windows-11/people-inside-microsoft-are-fighting-to-drop-...
393•breve•7h ago•332 comments

Show HN: Open-Source Animal Crossing–Style UI for Claude Code Agents

https://github.com/outworked/outworked/releases/tag/v0.3.0
29•ZeidJ•3h ago•23 comments

Schedule tasks on the web

https://code.claude.com/docs/en/web-scheduled-tasks
275•iBelieve•16h ago•221 comments

Gzip decompression in 250 lines of Rust

https://iev.ee/blog/gzip-decompression-in-250-lines-of-rust/
92•vismit2000•3d ago•34 comments

A Faster Alternative to Jq

https://micahkepe.com/blog/jsongrep/
348•pistolario•14h ago•221 comments

Hold on to Your Hardware

https://xn--gckvb8fzb.com/hold-on-to-your-hardware/
518•LucidLynx•11h ago•432 comments

Browser-based SFX synthesizer using WASM/Zig

https://knell.medieval.software/studio
21•galsjel•4h ago•2 comments

21,864 Yugoslavian .yu domains

https://jacobfilipp.com/yu/
55•freediver•1d ago•77 comments

Capability-Based Security for Redox: Namespace and CWD as Capabilities

https://www.redox-os.org/news/nlnet-cap-nsmgr-cwd/
5•ejplatzer•2h ago•0 comments

EMachines never obsolete PCs: More than a meme

https://dfarq.homeip.net/emachines-never-obsolete-pcs-more-than-a-meme/
50•zdw•3d ago•32 comments

AI got the blame for the Iran school bombing. The truth is more worrying

https://www.theguardian.com/news/2026/mar/26/ai-got-the-blame-for-the-iran-school-bombing-the-tru...
260•cptroot•4h ago•217 comments

Should QA exist?

https://www.rubick.com/should-qa-exist/
69•PretzelFisch•10h ago•111 comments

Everything old is new again: memory optimization

https://nibblestew.blogspot.com/2026/03/everything-old-is-new-again-memory.html
153•ibobev•4d ago•109 comments

TurboQuant: Building a Sub-Byte KV Cache Quantizer from Paper to Production

https://demo.aitherium.com/blog/turboquant-sub-byte-kv-cache-from-paper-to-production
5•wizzense•2h ago•1 comments