> The models don’t have to get better, the costs don’t have to come down (heck, they could even double and it’d still be worth it), and we don’t need another breakthrough.
The costs should come down. I don’t know what costs this post refers to, but the cost of using Claude is almost definitely hiding the actual cost.
That said, I’m still hoping our public models work well enough with opencode or other options that my cost becomes more transparent to me: whatever is added to my electric bill, rather than a subscription to Claude.
I don't think everything is for certain though. I think it's 50/50 on whether Anthropic/whoever figures out how to turn them into more than a boilerplate generator.
The imprecision of LLMs is real, and a serious problem. And I think a lot of the engineering improvements (little s-curve gains or whatever) have caused more and more of these. Every step or improvement has some randomness/lossiness attached to it.
Context too small?:
- No worries, we'll compact (information loss)
- No problem, we'll fire off a bunch of agents each with their own little context window and small task to combat this. (You're trusting the coordinator to do this perfectly, and cutting the sub-agent off from the whole picture)
All of this is causing bugs/issues?:
- No worries, we'll have a review agent scan over the changes (They have the same issues though, not the full context, etc.)
Right now I think it's a fair opinion to say LLMs are poison and I don't want them to touch my codebase, because they produce more output than I can handle and the mistakes they make are too subtle for me to reliably catch.
It's also fair to say that you don't care, and your work allows enough bugs/imprecision that you accept the risks. I do think there's a bit of an experience divide here, where people more experienced have been down the path of a codebase degrading until it's just too much to salvage – so I think that's part of why you see so much pushback. Others have worked in different environments, or projects of smaller scales where they haven't been bit by that before. But it's very easy to get to that place with SOTA LLMs today.
There's also the whole cost component to this. I think I disagree with the author about the value provided today. If costs were 5x what they are now, I think it would be a hard decision for me to decide if they are worth it. For prototypes, yes. But for serious work, where I need things to work right and be reasonably bug free, I don't know if the value works out.
I think everyone is right that we don't have the right architecture, and we're trying to fix layers of slop/imprecision by slapping on more layers of slop. Some of these issues/limitations seem fundamental and I don't know if little gains are going to change things much, but I'm really not sure and don't think I trust anyone working on the problem enough to tell me what the answer is. I guess we'll see in the next 6-12 months.
When I look back over my career to date there are so many examples of nightmare degraded codebases that I would love to have hit with a bunch of coding agents.
I remember the pain of upgrading a poorly-tested codebase from Python 2 to Python 3 - months of work that only happened because one brave engineer pulled a skunkworks project on it.
One of my favorite things about working with coding agents is that my tolerance for poorly tested, badly structured code has gone way down. I used to have to take on technical debt because I couldn't schedule the time to pay it down. Now I can use agents to eliminate that almost as soon as I spot it.
Overall I like using it still but I can also see my mental model of the codebase has significantly degraded which means I am no longer as effective in stopping it from doing silly things. That in itself is a serious problem I think.
What worries me about this is that it might end up putting up a barrier for those that can't afford it. What do things look like if models cost $1000 or more a month and genuinely provide 3x productivity improvements?
But there is an interesting point about what it does to hobby dev. If it takes real money just to screw around for fun on your own, it's kinda like going back to the old days when you needed to have an account on a big university system to do anything with Unix.
Small bootstrapped startups are more what I had in mind. Of course an established company can pay it. I don't like the idea of a world where all software is backed by big companies.
But yeah, I share your concern about open source and hobby projects. My hope would be that you get free tiers that are aimed at hobby/non-profit/etc stuff, but who knows.
You can bootstrap something with yourself and a friend with some hard work and intelligence.
This is available to people all over the world, even those in countries where $1000 is a month's salary.
Microsoft and their employees will be fine, yeah. That's not who I'm thinking about.
So… humans are now doing the stuff that computers are supposed to do and be good at?
Imagine if we had to suffer these posts, day in and day out, when React or Kubernetes or any other piece of technology got released. This kind of proselytizing is the very reason there is tribalism around AI.
I don't want to use it, just like I don't want to use many technologies that got released, while I have adopted others. Can we please move on, or do we have to suffer this kind of moaning until everybody has converted to the new religion?
Never in my 20 years in this career have I seen such maniacal obsession as there has been over the past few years: the never-ending hype that has transformed this forum into a place I do not recognise, and this career into one I don't recognise, where people you used to respect [1] have gone into a psychosis and dream of ferrets, and if you dare to be skeptical about any of it, you are bombarded with "I used to dislike AI, now I have seen the light and if you haven't I'm sorry for you. Please reconsider." stories like this one.
Jesus, live and let live. Stop trying to make AI a religion. It's posts like this one that create the sort of tribalism they rail against, into a battle between the "enlightened few" versus the silly Luddites.
If such a guy is slowly dipping his toes into AI and comes to the conclusion he just posted, you should take a step back and consider your position.
I know what its capabilities are. If I wanted to manage a set of enthusiastic junior engineers, I'd work with interns, which I love doing because they learn and get better. (And I still wouldn't want to be the manager.) AIs don't, not from your feedback anyway; they sporadically get better from a new billion dollar training run, where "better" has no particular correlation with your feedback.
I agree on your specific points about what you prefer, and that's fine. But as I said 15 years ago to some recent Berkeley grads I was working with: "You have no right to your current job. Roles change."
AI will get better and be useful for some things. I think it is today. What I'm saying is that you want to be in the group that knows how to use it, and you can't be there if you have no experience.
There's no both-sides-ing of genAI. This is an issue akin to street narcotics, mass weapons of war, or forever chemicals. You're either on the side of heavy regulation or outright bans, or you're on the side of tech politics which are directly harmful to humanity. The OP is not a thoughtful moderate because that's not how any of this works.
I don't think this has yet been established. We'll have to wait and see how it turns out. My inclination is it'll turn out like most other technological advancements - short term pain for some industries, long term efficiency and comfort gain for humans.
Despite the anti-capitalist zeitgeist, more humans of today live like kings compared to a few hundred years ago, or even 100 years ago.
But you seem to have jumped to the conclusion that everyone agrees AI is harmful.
Many people are seeing this as an existential moment requiring careful navigation and planning, not just another language or browser or text editor war.
So that's how I think AI will be seen in 20 years: like the PC, the internet, and mobile phones. Tech that shapes society, for better or worse.
This is a tipping point, and most anti-AI advocates don't understand that other software developers who keep telling them to reevaluate their positions are often just trying to make sure no one is left behind.
For some people, that's picking up the tool and trying to figure out what it's good for (if anything) and how it works.
Choosing not to use AI agents is maybe the only tool position I feel I've had to defend or justify in over a decade of doing this, and it's so bizarre to me. It almost reeks of insecurity from the Agent Evangelists and I wonder if all the "fear" and "uncertainty" they talk about is just projecting.
Some of those are before your time, but: The only time you don't get pushed to use new technologies is when a) nothing is changing and the industry is stagnant, or b) you're ahead of the curve and already exploring the new technology on your own.
Otherwise, everyone always gets pushed into using the new thing when it's good.
and then there is AS/400 and all the COBOL still in use which AI doesn't want to touch.
Docker has obvious benefits over bare metal.
Etc.
My own experiences with LLMs have shown them to be entertaining, and often entertainingly wrong. I haven't been working on a project I've felt comfortable handing over to Microsoft for them to train Copilot on, and the testimonials I've seen from people who've used it are mixed enough that I don't feel like it's worth the drawbacks to take that risk.
And...sure, some people have to be pushed into using new things. Some people still like using vim to write C code, and that's fine for them. But I never saw this level of resistance to git, Docker, unit tests, or CSS.
(Likewise with CVS to svn: "you can rename files now? and branches aren't horrible? Great, how fast can we switch?" - no "pushing" because literally everyone could see how much better it was in very concrete cases, it was mostly just a matter of resource allocation.)
In the context of this discussion, it feels more like ipv6 :-)
But when you stop trying new stuff (“because you don’t want to”), it is a sign that your inner child got lost. (Or you have a depression or burnout.)
Generally, sensible companies held off on this sort of transition until the tool was mature and stable.
Way back in the day, I was involved in an abortive move from CVS to SVN. It went great for a week, then the SVN db corrupted itself irretrievably, taking a week's work with it... I think we finally moved for real about a year later, when SVN had abandoned its extremely unreliable BDB backend that early versions used.
Forcing adoption of [AI tool of the month] now feels a bit more like, say, adopting Darcs back during the DVCS wars than adopting git after it had won.
If that is not interesting to you I think that’s a totally fine choice, but you’re getting a lot of pushback from people who have made a different choice.
At least with other advancements in our field like git, Docker, etc., they're made with a local-first mindset (e.g. your git repos can live anywhere and same with your docker images)
Also not all of us need to sell ourselves as high-speed AI-boosted developers, especially those with decades of experience. Investors might well choose to invest in artisanal coding, and many of us can act as our own investors as well. So the inevitability of agentism is still undecided IMHO.
Not doing so seems a bit like a farmer ploughing fields and harvesting crops by hand while seeking to remain competitive with modern machinery, surely?
That's probably true on some level for some evangelists, but it's probably just as true that some people who are a bit scared of AI read every positive post about it as some sort of propaganda trying to change their mind.
Sometimes it's fine to just let people talk about things they like. You don't know what camp someone is in so it's good to read their post as charitably as possible.
Because your boss is going to want you capable of using these things effectively as soon as 1-2 years from now? If not them, then their boss.
You mean the knowledge that Claude has stolen from all of us and regurgitated into your projects without any copyright attributions?
> But I see a lot of my fellow developers burying their heads in the sand
That feeling is mutual.
Also: Over the past 20 years, I could count on one hand the number of times I have been able to get away with outright copy/paste from SO.
And comes with a price tag paid to people who neither own nor generated that content. You don't think that shifts the ethical boundaries _significantly_?
I would very much like someone to give me the magic reproduction triple: a model trained on your code, a prompt you gave it to produce a program, and its output showing copyright infringement on the training material used. Specific examples are useful; my hypothesis is that this won't be possible using a "normal" prompt that's in general use, but rather a prompt containing a lot of directly quoted content from the training material, that then asks for more of the same. This was a problem for the NYT when they claimed OpenAI reproduced the content of their articles...they achieved this by prompting with large, unmodified sections of the article and then the LLM would spit out a handful of sentences. In their briefing to the court, they neglected to include their prompts for this reason. I think this is significant because it relates to what is really happening, rather than what people imagine is happening.
But I guess we'll get to see from the NYT trial, since OpenAI is retaining all user prompts and outputs and providing them to the NYT to sift through. So the ground-truth exists, I'm sure they'll be excited to cite all the cases where people were circumventing their paywall with OpenAI.
Then you have been misled:
https://arstechnica.com/features/2025/06/study-metas-llama-3...
> I would very much like someone to give me the magic reproduction triple
Here's how I saw it directly. Searched for "node http server example." Google's AI spit out an "answer." The first link was a Digital Ocean article with an example. Google's AI completely reproduced the DO example down to the content of the comments themselves.
So.. don't know what to tell you. How hard have you been looking yourself? Or are you just trying to maintain distance with the "show me" rubric? If you rely on these tools for commercial purposes then the onus was always on you.
> So the ground-truth exists
And you expect a civil trial to be the most reliable oracle of it? I think you know what I know but would rather _not_ know it.
As to your Ars article, I'm familiar because I read Ars.
> The chart shows how easy it is to get a model to generate 50-token excerpts from various parts of Harry Potter and the Sorcerer’s Stone. The darker a line is, the easier it is to reproduce that portion of the book.
50-token excerpts are not my concern, that's 40 words. The argument that the plaintiffs need to make is that people are not paying for the NYT because of ChatGPT (part of the four fair use pillars, I could expand, but won't). That's gonna be tough. Let's revisit this after the ruling and/or settlement.
You can't, and shouldn't be able to, copyright and hoard "knowledge".
You can twist this around as much as you like but there are several studies showing that LLMs can and will happily reproduce content from their training data.
Correct. But if I read your code, produce a detailed specification of that code, and then give that specification to another team (that has never seen your code) and they create a similar product, then they haven't broken the law.
LLMs reproducing exact content from their training data is a symptom of overfitting and is an error that needs correcting. Memorizing specific training data means that it is not generalizing enough.
That costs significantly more and involves the creation of jobs. I see this as a great outcome. There seems to be a group of people who share the opposite of my views on this matter.
> and is an error that needs correcting
It's been known for years. They don't seem interested in doing that or they simply aren't capable. I presume because most of the value in their service _is_ the copyright whitewashing.
> Memorizing specific training data means that it is not generalizing enough.
Is that like a knob they can turn or is it something much more fundamental to the technology they've staked trillions on?
I don't see it that way. If whatever you're doing can now be automated then it's become a bullshit job. It's no longer a benefit to humanity to have a human sit on their ass, stand on their feet, or break their back to do a job that can be automated. As a software developer, it's my job to take the dumb repetitive stuff that humans do and make it so that humans never have to do that job again.
If that's a problem for society, it's because society is messed up.
> It's been known for years. They don't seem interested in doing that or they simply aren't capable.
I don't find that to be a particularly big problem. Fundamentally an AI isn't just compressing all human knowledge and decompressing it on demand; it's tweaking parameters in a giant matrix. I can reproduce the lyrics of songs that I've heard but that doesn't mean there is a literal copy of that song in my brain that you could extract out with a well placed scalpel. It just means I've heard it a bunch of times and the giant matrix in my brain is tuned to be able to spit it out.
> Is that like a knob they can turn or is it something much more fundamental to the technology they've staked trillions on?
In a sense, it's a knob. It's not fundamental to the technology; if it's reproducing something exactly that likely means it's over-trained on that data. It's actually bad for the models (makes them more incorrect, more rigid, and more repetitive) so that is a knob they will turn.
where are the websites that are lightning fast, where speed and features and ads have been magically optimized by ai, and things feel fast like 2001 google.com fast
why does customer service still SUCK?
AI has increased the sheer volume of code we are producing per hour (and probably also the amount of energy spent per unit of code). But, it hasn't spared me or anyone I know the cost of testing, reviewing or refining that code.
Speaking for myself, writing code was always the most fun part of the job. I get a dopamine hit when CI is green, sure, but my heart sinks a bit every time I'm assigned to review a 5K+ loc mountain of AI slop (and it has been happening a lot lately).
My medium term concern is that the tasks where we want a human in the loop (esp review) are predicated on skills that come from actually writing code. If LLMs stagnate, in a generation we’re not going to have anyone who grew up writing code.
1: not that LLMs write objectively bad code, but it doesn’t follow our standards and patterns. Like, we have an internal library of common UI components and CSS, but the LLM will pump out custom stuff.
There is some stuff that we can pick up with analysers and fail the build, but a lot of things just come down to taste and corporate knowledge.
I don't see why it doesn't help with reviewing, testing, or refining code either. One of the advantages I find is that an LLM "thinks" differently from me so it'll find issues that I don't notice or maybe even know about. I've certainly had it develop entire test harnesses to ensure pre/post refactoring results are the same.
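To make the test-harness idea concrete, here is a minimal sketch of what a pre/post refactoring check can look like. Everything in it is hypothetical: `legacy_slugify`, `refactored_slugify`, and `find_mismatches` are made-up names for illustration, not anything from a real project. The point is only that you run the old and new implementations over the same inputs and report every place they disagree.

```python
import re
from typing import Callable, Iterable, List, Tuple


def legacy_slugify(title: str) -> str:
    # Hypothetical "before" code: builds the slug one character at a time.
    out = ""
    for ch in title.strip().lower():
        if ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"
    return out.rstrip("-")


def refactored_slugify(title: str) -> str:
    # Hypothetical "after" code: shorter, but only treats ASCII as alphanumeric.
    return re.sub(r"[^a-z0-9]+", "-", title.strip().lower()).strip("-")


def find_mismatches(
    old: Callable[[str], str],
    new: Callable[[str], str],
    inputs: Iterable[str],
) -> List[Tuple[str, str, str]]:
    # Run both implementations on every input and collect
    # (input, old_result, new_result) wherever the results differ.
    mismatches = []
    for value in inputs:
        expected, actual = old(value), new(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


if __name__ == "__main__":
    samples = ["  Hello, World!  ", "--a", "", "???", "Caf\u00e9"]
    for value, expected, actual in find_mismatches(
        legacy_slugify, refactored_slugify, samples
    ):
        print(f"{value!r}: old={expected!r} new={actual!r}")
```

With these samples the ASCII inputs agree, and the accented one is flagged because the refactor silently drops non-ASCII letters. That is the kind of subtle behavior change this sort of harness is meant to surface, whoever (or whatever) wrote the refactor.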
That said, I have "held it wrong" and had it do the fun stuff instead and that felt bad. So I just changed how I used it.
I read anecdotes of teams that push through AI-driven changes as fast as possible with awe. Surely their AIs are no more capable than the ones I'm familiar with.
I still think whether you see sustained value or not depends a lot on your workflow -- in what you choose to do or decide and what you let it choose to do or decide.
I agree with you that this idea of just pushing out AI code -- especially code written from scratch by an AI -- sounds like a disaster waiting to happen. But honestly a lot of organizations let a lot of crappy code into their codebase long before AI came along. Those organizations are just doing the same now at scale. AI didn't change the quality, it just changed the quantity.
Many of these techniques can also work with Chinese LLMs like Qwen served by your inference provider of choice. It's about the harness that they work in, gated by a certain quality bar of LLM.
Taking a discussion about harnesses and stochastic token generators and forcing it into a discussion of American imperialism is making a topic political that is not inherently political, and is exactly the sort of aggressive, cussing tribalistic attitude the article is about.
TBH i'm fine with AI but my main concern isn't any of these issues (even if they suck now -though supposedly Claude Code doesn't- they can get better in the future).
My main concern, by far, is control and availability. I do not mind using some AI, but i do mind using AI that runs on someone else's computer and isn't under my control - and i can, or have a chance at, understanding/tweaking/fixing (so all my AI use is done via inference engines that are written in C++ that i compiled myself and are running on my PC).
Of course the same logic applies to anything where that makes sense (i.e. all my software runs locally, the only things i use online/cloud versions for are things which are inherently about networking - e.g. chat, forums, etc, but even then i use -say- a desktop-based email client instead of webmail).
If it produces value for you, you should use it. If not, don't.
This is a genuine question -- I really don't understand. I appreciate local tooling when it helps my long-term efficiency, even if there's a learning curve. But not if cloud seems like it will always be more efficient. And while there are LLMs you can run locally, it doesn't seem like the ones useful for coding, with their vast memory and GPU requirements, will be realistic or cost-effective to run locally in the foreseeable future.
Because cost-effectiveness is a short term concern compared to...
> what is the benefit of control?
...the independence that being in control provides you in the long term. As for why to be independent, i hope it should be self-evident that being able to do what you want and work on what you want without having to rely on 3rd parties for a core component of that work is a good thing.
And TBH i'm not sure why being fast at the cost of everything else (especially of independence and control) is even considered a good thing in the first place.
To be honest, not really.
I have a million limitations in my life. Trying to achieve some kind of "independence" is not something I understand. I prefer to accept a kind of interdependence, to be part of an ecosystem. To work together, in sync, for mutual benefit.
I rely on third parties for my food, my housing, my health, my education, my technology, all of it. Using an LLM hosted elsewhere feels no different from using electricity generated elsewhere, or food grown elsewhere, or a computer manufactured elsewhere. So why the difference for you?
Same but trying to add more limitations (in my view a reliance on a 3rd party to do what i want is a limitation) is not something i like to add without having an incredibly good reason without alternative options.
> So why the difference for you?
In general because each dependence comes with requirements and expectations (many of which i may not even know ahead of time) from my side. The simplest and most straightforward one when it comes to cloud LLMs would be the requirement to have an internet connection (which i may or may not have, for a variety of reasons) and of course money to pay for it - and, at least with the way LLMs are currently monetized, that money would depend on how much i need it - and i may either not have that money or not want or even be able to spend it (again for whatever reasons). Even if someone else would pay (e.g. a workplace) this can have indirect effects, like my employer tracking my LLM use (either via how much i'd cost them for its use or how much i'm using it by counting tokens - the latter of which is something many people have mentioned is already being done, though for now it is to maximize LLM use as CEOs are still in their FOMO phase), which in turn can have negative consequences for me.
Just like Microsoft nowadays has almost zero incentive to provide a good quality OS despite Linux existing, since they've captured an overwhelming majority of the desktop space, there is no guarantee that once some LLM provider captures the overwhelming majority of a market it won't jack up prices and let quality languish even if there are theoretically alternatives - especially if said provider has built a dependency moat around it with various tools that only work with their LLMs (some LLM providers make their own tools and this isn't out of the good of their hearts).
But there is more to it than just the obvious stuff above. Being in control means nobody will force you to do or not do something you dislike - even if you end up doing the same thing down the road, it'd be your decision, not someone else's forced on you.
One example i'm certain many people would have encountered is software updates making the experience of existing users worse. With something cloud-based there isn't much you can do - what if i liked the original GMail, YouTube or even Facebook interfaces more than their current incarnations? There is nothing i can do about it, i just have to accept that i have no control over them. The best i can do is hope that the developers, like in Reddit's case for example, would leave the old UI around and not mess with it much - but even then, i'm at the mercy of those developers, not in control myself. And while with something like GMail i could at least use a desktop application (and hope GMail doesn't remove the feature that makes that possible), the core features of YouTube, Facebook and Reddit are mainly their userbases, not their UIs - i do not visit Facebook because i like how it works or behaves, i visit it because it is a point of contact with some family members and acquaintances. Similarly, i do not visit Reddit because i like its UX, i visit it because of the stuff people post and comment there.
Another example, more relevant to LLMs, would be when OpenAI upgraded ChatGPT from 3.5 to 4 or something like that (i do not use ChatGPT so i do not know) and people really disliked the change of tone their chatbots had. Say whatever you want about if that was good or not (though it'd be beside the point i'm trying to make), but ultimately, it was a clear example of someone in power (OpenAI) making changes that some of their users greatly disliked but had zero control or power to do anything about it. AFAIK a similar (though less publicized) issue was when Anthropic changed Claude 3 to Claude 4 but AFAIK Claude 3 still remains available - but that is, like with Reddit's case, because of Anthropic's "benevolence" (as long as it is financially viable for them, of course).
Willingly exposing myself to more dependencies, when my experience so far has shown that they come with long term consequences that are often not aligned with my desires isn't something i like doing. As you implied, there are already aspects of life where we do not have much control, but to me the existence of those acts more of an incentive to avoid losing further control where i can than to give up on it entirely.
On the topic of LLMs, from a personal perspective at least, if local LLMs end up being completely inadequate and making software becomes a matter of developers becoming little more than "remote LLM operators" then i'll just treat being a "remote LLM operator" the same way as being a secretary or accountant: something that i'm not interested in, even if their work often involves using computers.
It seems like you have what might be called an extreme sense of loss aversion, and so the more control and independence you have, the more you can prevent loss.
In contrast, I don't really have that. Sure I get annoyed when a software interface changes, but at the same time I see that the updates overall have also given me 10 other features I really appreciate, and so I see it as a net win. On the whole, I find that being embedded in a web of up-to-date dependencies has always been a large net positive. There are losses, but they are far outweighed by the wins, so whenever a loss bugs me I just remind myself of all the new helpful stuff. Like, Spotify's changes to UX drive me nuts sometimes. But they recently launched prompted playlists that have been a game changer for me. They added transitions between songs which is awesome. I'm using them to listen to audiobooks my library doesn't have. So I can put up with the UX.
But if you experience losses psychologically as 10x the size of wins of the same "objective" size, then your calculus could be different. Pretty much everybody has loss aversion to some extent, it's considered a standard human trait -- I have to remind myself to put things into perspective myself sometimes -- but it sounds like you have a much stronger sense of it, so the control that greater independence gives you is much more valuable to you than it is to someone like me.
So that's why, when you say, "i hope it should be self-evident" -- it's not self-evident to someone like me at all, but I can see why it seems self-evident to you.
It is another post that advocates for AI assisted coding without addressing the question of responsibility and trust. It makes claims without offering test data or even talking about testing.
Others will be here to say "Just another evangelist telling us we're going to miss out".
To add a bit more of an interesting take (because all the arguments at this point are soooooo boring), my main issue at this point isn't whether I find it useful or not (I 100% do), it is that I'm now relying on Claude a bit too much and I find it frustrating when I work offline. I am wary of that, a lot actually.
What about 10x more?
Edit: If I get a raise, I'd consider paying up to $25,000 per year for the aforementioned Claude automaton.