No, seriously. If you have the skills to join, you should be able to handle the proper way of programming with no AI at all.
Has this ever happened besides the Y2K fixes?
Wouldn't it be much more likely that the companies will simply go under? Or that they will make a team that writes a completely new version of the code, somewhat like Mac OS X was a replacement of MacOS 9 and not a cleanup.
Isn't that one way of the "cleanup and re-design work"?
Success has many fathers, failure is an orphan.
Publications like the Economist or the WSJ that drive a lot of this hype among the investor and executive classes are loath to point out that their readers and owners are the proverbial emperor not wearing any clothes.
One of the benefits of tech unions (if they existed in any meaningful way) would be to point out where the emperor is naked in a way that is a bit harder to ignore than the kerfuffle that occurs inside hacker news threads or subreddits dedicated to experienced devs.
Careful speaking such heresy around here, you might get burned at the stake.
All of us are already in tech debt. The post-AI mess won't look significantly different from the pre-AI mess.
If anything I would expect management to throw more money at AI in hopes of fixing the mess (if management even perceives such a mess).
And if it does hit a dead end, just regenerate a new version of the entire system in minutes.
I don’t know what’s going to happen with AI coding, but it often seems to me that people are making fundamental errors when framing the problem.
“But how will humans maintain AI code?” Is one such example. Why would we expect one part (code creation) to change dramatically without all other parts equally undergoing a revolution?
Because due to how complexity works, you reach a state where the expected number of breakages from modifying the code base exceeds the value from the change itself. Even if you assume the cost of labor is 0. It’s like the monkeys writing Shakespeare.
> And if it does hit a dead end, just regenerate a new version of the entire system in minutes.
If this worked it would have been done every few years at big companies. In reality, prototypes cannot take the place of a legacy system because it’s deeply integrated and depended upon. Most notably through the data it owns, but in many subtle ways as well. Simply verifying that a new system will not break is a massive undertaking. Writing meaningfully testable systems is an order of magnitude harder than implementing them.
When there’s a monetary risk of bugs (lose data, lose users, mess up core business logic etc) companies pay for the confidence to get things right. Doesn’t mean it always works, or that their priorities are right, but a payment vendor is not going to trust vibes to do a db migration.
There are still many experimental prototype domains out there, like indie games and static web sites, where throw away and start over is pretty much fine. But that’s not the entire field.
Ah, but you see, without any promise of pudding: why eat the meat? I do the 'human engineering' part only to do... the rest of the engineering.
edit: The [currently] Dead reply below is a touch ironic. I rest my case. Great sell, worth the squeeze.
What was a day of script writing becomes 15 minutes of prompt engineering to clean up the CSV and convert it into the proper statements. Massive time savings!
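For a sense of the kind of script that prompt is standing in for, here's a minimal sketch in Python; the file name, table, columns, and cleanup rules are all hypothetical, and a real job would encode the analyst's actual validation rules:

```python
import csv
import sqlite3

# Minimal sketch: load a messy CSV, apply basic cleanup, and insert rows into SQL.
# The file name, table, column names, and cleanup rules here are hypothetical.
def load_users(csv_path: str, db_path: str = "users.db") -> int:
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, email TEXT)")
    inserted = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            name = (row.get("name") or "").strip()
            email = (row.get("email") or "").strip().lower()
            if not name or "@" not in email:
                continue  # skip rows that fail basic validation rather than guessing
            # Parameterized insert avoids quoting problems from dirty fields.
            conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
            inserted += 1
    conn.commit()
    conn.close()
    return inserted
```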
I believe one can vibe code a basic app or script. Like building one of those little trinkets people put on websites, like calculators or mini games.
But I agree, the LLM can’t build, say, Google Chrome, or AWS Cognito, or Netflix, or anything that actually has serious billion dollar value. It’s not a “when” question, because these are not “solvable” problems.
But that has a flip-side too: analysts are more likely to give you CSVs with more issues because they know you can clean it up with AI. And pretty soon people will put less care in general in what they do until something goes wrong with this translate-with-AI paradigm.
Perhaps my few decades in the industry have been in areas where it is always the details, correctness, and fitness for purpose that make those problems hard, not the work itself.
I do see a use case for throw away spikes, or part of a red-green-refactor, etc.. but if accuracy and correctness aren't critical, data cleanup is easy even without an LLM.
Sure, for someone who does ETL-type work all day, or often enough anyway, they'd scoff, and true, an LLM won't really save them time. But for me, who does it once in a blue moon, LLMs are great. It's still on me to determine correctness; I'm simply no longer contending with the bootstrap problem of learning new packages and their syntax and common usage.
The CSV to SQL for analysts problem is a data integrity problem that is domain specific and not tool specific.
Remember that a 'relation' in relational databases is just a table, specifically named columns and tuples (rows).
A CSV is also just tuples (lines), but obviously SQL also typically has multiple normalized tables etc...
Typically bad data is worse than missing data.
For analysts, missing data can lead to bias and reduced statistical power, but methods exist and it can often be handled.
Bad data, on the other hand, can be misleading, deceptive and/or harmful. An LLM will, by its very nature, be likely to produce bad data when cleaning.
The risk of using an LLM here is that it doesn't have context or nuances to deal with that. Data cleaning via (sed,grep,tr,awk), language tools or even ETL can work....
I promise you that fixing that bad data will be far worse.
But using it in a red-green-refactor model may help with the above, but you will actively need to be engaged and dig through what it produces.
Personally I find it takes more time to do that than to just view it as tuple repacking...and use my favorite tools to do so.
Data cleaning is hard, but it is the context specific details that make it so.
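To make the "tuple repacking" idea concrete, here's a minimal deterministic sketch; the file name, column names, and formats are made up:

```python
import csv
from datetime import datetime
from typing import Optional

# Deterministic "tuple repacking": every rule is explicit and reviewable, and a row
# that fails validation is dropped (reported as missing) rather than guessed at.
def clean_row(row: dict) -> Optional[dict]:
    try:
        amount = float(row["amount"].replace(",", ""))
        date = datetime.strptime(row["date"].strip(), "%Y-%m-%d").date()
    except (KeyError, ValueError, AttributeError):
        return None  # missing data is easier to handle downstream than bad data
    return {"amount": amount, "date": date.isoformat()}

with open("export.csv", newline="") as f:  # hypothetical export from an analyst
    rows = list(csv.DictReader(f))

cleaned = [c for c in map(clean_row, rows) if c is not None]
print(f"kept {len(cleaned)} of {len(rows)} rows")
```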
A colleague did this recently and found it had removed crucial data.
Better to get it to help you author scripts or difficult parts of scripts, and then review the code carefully.
You sound like you've done that CSV to SQL a lot of times, are subconsciously aware of all the pitfalls and unwritten (and perhaps never specified) requirements and you're limited by just your typing speed when doing it again.
I can use LLMs for stuff I can do in my sleep as well.
I move that even for you or me, LLMs ain't worth much for stuff we can't do in our sleep. Stuff we don't know or have only introductory knowledge of. You can use them to generate tutorials instead of searching for them, but that's about it.
Generating tutorials is good, but is it good because a LLM did it or because you can't find a good tutorial by searching any more?
That's fine, although I think only native English speakers would be proud of that. I am sure he also didn't use a spell-checker.
I'm not sure that would matter with/to most people.
I can only speak for myself and I don't care unless I'm reading a text by someone who claims to be either an English teacher or Linguistic subject expert.
AI is closer to this sentiment than it is to the singularity.
FWIW I think OP came up with an excellent analogy.
Back to AI though.
I just checked the customer support page of a hyped AI app generator and it's what you'd expect: "doesn't work on complex project", "wastes all my tokens", and "how to get a refund".
These things are over-promising, and a future miracle is required to justify valuations. Maybe the miracle will come, maybe not.
I'm not sure why you continued using words when you summed up 3D printing with those four words. In the time it takes to print 1 object, you could have molded thousands of them. 3D printing has done a lot for manufacturing in terms of prototyping and making the first thing while improving flexibility for iterations. Using them for mass production is just not a sane concept.
But in the time it took me to convert a picture of my cat to a 3d model using AI and print it, I could have ... got on the phone to the injection molding lab, and asked about availability to produce the mold for that cat.
3d printing fits the niche where you either need to model something or make something so bespoke that it isn't worth setting up custom machinery.
The point is 3d printing is useful and the tech is improving and it will get more and more useful. It won't take over manufacturing of course (just like Rust won't take over all programming).
It's enabled some acceleration of product prototyping and it has democratized hardware design a little bit. Some little companies are building some buildings using 3D printing techniques.
Speaking as someone who owns and uses a 3D printer daily, I think the biggest impact it's had is that it's a fun hobby, which doesn't strike me as "world-changing."
Between that and the changed game for hobbyists, the world is meaningfully different.
Most world-changing inventions do so subtly. Atom bombs are the exception, not the rule.
But this seems an unfair comparison. For one, I think 3D printing made me better, not worse, at engineering (back when mechanical engineering was my jam), as it allowed me to prototype and make mistakes faster and cheaper. While it hasn’t replaced all manufacturing (or even come close), it plays an important role in design without atrophying the skills of the user.
You might argue that LLMs have simply exposed some systematic defects instead of improving anything, but the impact is there. Dozens of lecturing workflows that were pretty standard 2 years ago are no longer viable. This includes the entirety of online and remote education which ironically dozens of universities started investing in after Covid, right around when chatgpt launched. To put this impact in context, we are talking about the tertiary and secondary sector globally.
There will be the before and after AI eras in academia.
I don't get this. Either you do graded home assignments, which the person takes without any examiner and which you could always cheat on, or you do live exams, where people can't rely on AI. LLMs make it easier to cheat, but it's not a categorical difference.
I feel like my experience of university (90% of the classes had in-person exams, some had home projects for a portion of the final marks) is fundamentally different from what other people experienced and this is very confusing for me.
But 3D printing and AI are on totally different trajectories.
I haven't heard of Mattel saying, "we're going to see in what places we can replace standard molding with 3d printing". It's never been considered a real replacement, but rather potentially a useful way to make PoCs, prototypes and mockups.
3D printing has definitely replaced some manufacturing, and it has had a huge effect on product design.
These anti-AI articles are getting more tedious than the vibe coding articles.
Sure, but I'd argue the AIs are the new injection molding (as mentioned downthread) with the current batch being the equivalent of Bakelite.
Plus, who seriously said 3d printers were going to churn out Barbies by the millions? What I remember is people claiming they would be a viable source of one-off home production for whatever.
I don't think LLMs were ever meant to completely replace human engineering, at least in the way we think of engineering as a craft. But the truth is the world is changing, and with LLMs/AI, the goalposts are changing, too. With massive economies of scale and huge output, the goal is less and less good engineering and more and more pure output.
Put it another way, the post-AI world considers a reduction of quality to be an acceptable tradeoff for the massive short-term gains. Moreover, although quantity over quality is not exactly a new concept, what AI does is magnify and distill that concept as a prime directive.
Code doesn't have to work well to be profitable given how cheaply it can be produced, and that's a general effect of technology that can reach global audiences, and AI is the apex of that technology.
I don't think there is a nuanced view: AI is an all-around bad thing and its positives in the short-term will be vastly dwarfed by its negatives except for those who have become specialized at concentrating wealth at the expense of the common good.
Tells me a lot about people like you who make these comments rather than about LLMs.
I don't believe that that's something that we can stop. Somehow, I feel like the popularization of LLMs is a force of natural selection where those who are smart enough to keep training their minds will find themselves more financially secure than those who don't, and therefore more likely to survive.
Yes, that's exactly right, it's the prisoner's dilemma. You articulated it perfectly.
Code is a liability. Code is expensive to maintain, has bugs, security issues, performance issues. The short-term profitable solutions will have a very narrow window to succeed because they will quickly crumble under their own weight.
(Emphasis mine)
This has been the biggest pain point for me, and the frustrating part is that you might not even realize you're leading it a particular way at all. I mean it makes sense with how LLMs work, but a single word used in a vague enough way is enough to skew the results in a bad direction, sometimes contrary to what you actually wanted to do, which can lead you down rabbit holes of wrongness. By the time you realize, you're deep in the sludge of haphazardly thrown-together code that sorta kinda barely works. Almost like human language is very vague and non-specific, which is why we invented formal languages with rules that allow for preciseness in the first place...
Anecdotally, I've felt my skills quickly regressing because of AI tooling. I had a moment where I'd reach out to it for every small task from laziness, but when I took a real step back I started realizing I'm not really even saving myself all that much time, and even worse is that I'm tiring myself out way quicker because I was reading through dozens or hundreds of lines of code, thinking about how the AI got it wrong, correcting it etc. I haven't measured, but I feel like in grand totality, I've wasted much more time than I potentially saved with AI tooling.
I think the true problem is that AI is genuinely useful for many tasks, but there are 2 camps of people using it. There are the people using it for complex tasks where small mistakes quickly add up, and then the other camp (in my experience mostly the managerial types) see it shit out 200 lines of code they don't understand, and in their mind this translates to a finished product because the TODO app that barely works is good enough for an "MVP" that they can point to and say "See, it can generate this, that means it can also do your job just as easily!".
To intercept the usual comments that are no doubt going to come flooding in about me using it wrong or trying the wrong model or whatever, please read through my old comment [1] for more context on my experience with these tools.
In other words, AI is my assistant, but it is MY responsibility to turn up quality, maintainable work.
However, to put things in perspective for the masses: just consider the humble calculator. It has ruined people’s ability to do mental math. AI is going to do that for writing and communication skills, problem solving skills, etc.
I agree fully, I use it as a bouncing off point these days to verify ideas mostly.
The problem is, and I'm sure I'm not alone in this, management is breathing down my neck to use AI for fucking everything. Write the PR with AI, write the commit message with AI, write the code, the tests, use YOUR AI to parse MY AI's email that I didn't bother proofreading and has 4 logical inconsistencies in 1 sentence. Oh this simple feature that can easily be done for cheaper, quicker and easier without AI? Throw an AI at it! We need to sell AI! "You'll be left in the dust if you don't adopt AI now!"
It comes back to my point about there being 2 camps. The one camp actually uses AI and can see their strengths & weaknesses clear as day and realizes it's not a panacea to be used for literally everything, the other is jumping headfirst into every piece of marketing slop they come across and buying into the false realities the AI companies are selling them on.
A GREAT example is good old Coke vs Pepsi.
That said, AI does make some things easier today, like if you have an example to use for "make me a page like this but with data from x instead of y". Often it's faster than searching documentation, even with the caveat that it might hallucinate. And ofc it will probably improve over time.
The particular improvement I'd like to see is (along with doing things right in general) finding the simplest solution without constantly having to be told to do so. My experience is the biggest drawback to letting chatgpt/claude/etc loose is quickly churning out a bunch of garbage, never stopping to say this will be too complex to do anything with in the future. TFA claims only humans can resist entropy by understanding the overall design; again, I don't know whether that will improve, but it feels like the big problem right now.
I'm glad I'm not the only one who feels this way. It seems like these models latch on to a particular keyword somewhere in my prompt chain and throw traditional logic out the window as they try to push me down more niche paths that don't even really solve the original problem. Which just leads to higher levels of frustration and unhappiness for the human involved.
> Anecdotally, I've felt my skills quickly regressing because of AI tooling
To combat this, I've been trying to use AI to solve problems that I normally would with StackOverflow results: for small, bite-sized and clearly-defined tasks. Instead of searching "how to do X?", I now ask the model the same question and use its answer as a guide to solving the problem instead of a canonical answer.
I've read your comment about all the things you tried, and it seems you have much broader experience with LLMs than I do. But I didn't see this technique mentioned, so leaving this here in case it helps someone else :).
The real struggle will be, the people phoning it in are still going to be useless, but with AI. The rest will learn and grow with AI.
It's similar with full self drive. FSD is better than a bad, drunk, or texting human driver, and that's a lot of the drivers on the road.
There are real safety improvements from ADAS. For safety you only need crash avoidance, not a full-time chauffeur.
I would rather work in an office entirely staffed by well-meaning people struggling at their jobs than a single person like you.
False.
> An LLM is a token predictor. It works only at the level of text. It is not capable of working at a conceptual level: it doesn't reason about ideas, diagrams, or requirements specifications.
False.
Anyone who has spent time in machine learning or reinforcement learning understands that models are projections of higher-dimension concepts onto lower dimensions as weights.
There is no such thing as a higher dimensional concept, nor can they be projected into a weight space, because they aren't quantities.
The concept, say, "Dog" composes with the concept, "Happy" to form "Happy Dog". The extension(Dog) is all possible dogs, the extension(Happy) is all happy objects, the extensions here compose. The intension of "Dog" depends on its context, eg., "get that damned dog!" has a different intension than, "I wish I looked less like a dog!". And the intensions here do not compose like the extensions.
Take the act of counterfactual reality-oriented simulation, called "imagination" narrowly, call that `I`. And "curry" this operator with a simulation premise, "(as if) on mars", so we have I' = I(Mars)(...).
Now, what is the content of I'(Dog), I(Happy Dog), I(Get that damned dog), I(get that damned happy dog) ? and so on
The contents of `I` is nowhere modelled by "projection" because this does not model composition, and is not relevantly discrete and bounded by logical connectives.
These are trivial issues which arise the moment you're aware that few people writing computer science papers have ever studied the meaning of the words they use so freely.
Concept is an abstraction layer above human languages.
Here's a good article that touched on this topic: https://www.neelnanda.io/mechanistic-interpretability/glossa...
"Concept" is not a term from computer science, its use here has not only been "narrowed" but flat-out redefined. "Concept" as used in XAI (a field in which i've done research) is an extremely limited attempt to capture the extension of concepts over a given training domain.
Concept, as used by the author of this article that you are replying to, and 99.9999999...% of all people familiar with the term, means "concept". It does not mean what it has been asserted to mean in XAI.
And one of the most basic features of concepts is their semantic content, that they compose, that they form parts of propositions, and so on.
In Chinese language, concept is 概念.
In Chinese language, happy dog is 快乐的狗.
Notice it has an extra "的" that is missing in English language. This tells you that you can't just treat English grammar and structure as the formal definition of "concept". Some languages do not have words for happiness, or dog. But that doesn't mean the concept of happiness or dog does not exist.
The reverse is also true, you can't claim a concept does not exist if it does not exist in English language. Concept is something beyond any particular language or logical construct or notations that you invent.
That would be a consequence of your position.
The person who wrote the article is English. The claim being evaluated here is from the article. The term "concept" is English. The *meaning* of that term isn't English, any more than the meaning of "two" is English.
My analysis of "concept" has nothing to do with the english language. "Happy" here stands in for any property-concept and 'dog' any term which can be a object-concept, or a property-concept, or others. If some other language has terms which are translated into terms that do not function in the same way, then that would be a bad translation for the purpose of discussing the structure of concepts.
It is you who are hijacking the meaning of "concept", ignoring the meaning the author intended, substituting one made up 5 minutes ago by self-aggrandising, poorly read people in XAI -- and then going off about irrelevant translations into Chinese.
The claim the author made has nothing to do with XAI, nor chinese, nor english. It has to do with mental capacities to "bring objects under a concept", partition experience into its conceptual structure ("conceptualise"), simulate scenarios based on compositions of concepts ("the imagination") and so on. These are mental capabilities a wide class of animals possess, who know no language; that LLMs do not possess.
> It has to do with mental capacities to "bring objects under a concept", partition experience into its conceptual structure ("conceptualise"), simulate scenarios based on compositions of concepts ("the imagination") and so on. These are mental capabilities a wide class of animals possess, who know no language; that LLMs do not possess.
Assume it is true that humans have these capabilities; why do you think LLMs do not? We don't know whether they have them, and that's what explainable AI is for.
Take a step back and assume that LLMs are not capable of these capabilities: how do you prove that these are the fundamental concepts in the universe, instead of projections of higher-level concepts from a higher-dimensional space that humans and animals are not aware of? What if there exists a more universal set of concepts that contains all the concepts we know and others, in a higher dimension, and both LLMs and humans are just using the lower-dimensional projections of such concepts?
Good job too, otherwise the brain would have to magically create some entirely new physics to represent concepts.
In practical terms, what do you think the LLM output cannot contain right now? Because the way I read it now is "LLM can't speculate". But that's trivial to disprove by asking for that happy dog on Mars speculation you have as an example - whether you want the scientific version, or child level fun, it's available and the model will give nontrivial idea connections that I could not find anywhere. (For example childlike speculation from Claude included that maybe dogs would be ok playing in spacesuits since some dogs like wearing little coats)
Similarly "And the intensions here do not compose like the extensions." is really high level. What's the actual claim with regards to LLMs?
The issue is why one (prompt, answer) pair is given. If the answer is given as a "reasoning process" over salient parts of the prompt, that, e.g., involves imagining/simulation as expected, then for {(prompt', answer')} of similar imaginings we will get reliable mappings. If it's cheating, then we won't.
We can, I think, say for certain that the system is not engaged in counterfactual reasoning. Eg., we can give a series of prompts (p1, p2, p3...) which require increasing complexity of the imagined scenario, and we do not find O(answering) to follow O(p-complexity-increase). Rather the search strategy is always the same, and we can just get "mildly above linear" (pseudo-)reasoning complexity with chain-of-thought.
This applies the same to humans hearing a question and responding. Tokens in, tokens out (whether words or sound). It's not unique to LLMs, so not useful for explaining differences.
> then for {(prompt', answer')} of similar imaginings we will get reliable mappings. If it's cheating, then we won't.
You're not really showing that this is/isn't the case already. Also this would put people with quirky ideas and wild imagination in the "cheating" group if I understand your claim correctly. There's even a whole game around a similar concept - Dixit - describe an image in a way that as few people as possible will get it.
> we can give a series of prompts (p1, p2, p3...) which require increasing complexity of the imagined scenario, and we do not find O(answering) to follow O(p-complexity-increase). Rather the search strategy is always the same
You're describing most current implementations, not a property of LLMs. Gemini scales the thinking phase for example. Future models are likely to do the same. Another recent post implemented this too https://news.ycombinator.com/item?id=44112326
Eg., coffee has some internal kinetic energy in the motion of its molecules, and it has the disposition to cause a thermometer to rise its mercury to a certain height.
There's always an open question in these cases: is the height of the mercury a "reliable stand-in" for the temperature of the system? In many cases: NO. If you read off the height too quickly, you'll report the wrong temperature.
No system's intrinsic properties are, literally, just its measured properties. We are not literally our behaviours. An LLM is not literally its input/output tokens.
The question arises: what is the actual intrinsic property which gives rise to the measured properties?
It's very easy to see why people believe that the LLM case is parallel to the human case, because in ordinary circumstances, our linguistic behaviours are "reliable measures" of our mental states. So we apply the same perception to LLMs: so too must they generate outputs in the way we do, they must "Reason".
However, there are many much more plausible explanations of how LLMs work that do not resort to giving them mental capacities. And so what's left to those in possession of such better explanations is to try to explain to others why you cannot just put a thermometer into a xbox cd drive and think you're measuring how hot the player is.
LLMs are the idea you describe but made incarnate, in form of a computing artifact we can "hold in our hands", study and play with. IMHO people are still under-appreciating how big a thing this is fundamentally, beyond RAG and chatbots.
It is also the case that animals do not "reliably and universally" implement all aspects of all meanings they are acquainted with, so we aren't looking for 100% of capacities, 100% of the time.
Nevertheless, LLMs are only implementing a limited aspect of meaning: mostly association and "some extension". And with this, plus everything ever written, they can narrowly appear to implement much more.
Let's be clear though, when we say "implement" we mean that an answer arises from a prompt for a very specific reason: because the answer is meant by the system in the relevant way. In this sense, LLMs can mean any association, perhaps they can mean a few extensions, but they cannot mean anything else.
Whenever an LLM appears to partake in more aspects of meaning it is only cheating: it is using familiarity with families of associations to overcome its disabilities.
Like the idiot savant who appears to know all hollywood starlets, but is discovered eventually, not to realise they are all film stars. We routinely discover these disabilities in LLMs, when they attempt to engage in reasoning beyond these (formally,) narrow contexts of use.
Agentic AI is a very good "on steroids" version of this. Just try to use Windsurf, and the brittle edges of this trick appear quickly. It's "reasoning" whenever it seems to work, and "hallucination" when not -- but of course, it just never was reasoning.
> Whenever an LLM appears to partake in more aspects of meaning it is only cheating: it is using familiarity with families of associations to overcome its disabilities.
I'm not convinced there's anything more to "meaning" - we seem to be defining concepts through relationship to other concepts, and ground that directly or indirectly with experiences. The richer that structure is, the more nuanced it gets.
> Like the idiot savant who appears to know all hollywood starlets, but is discovered eventually, not to realise they are all film stars. We routinely discover these disabilities in LLMs, when they attempt to engage in reasoning beyond these (formally,) narrow contexts of use.
I see those as limitations of degree, not kind. Less idiot savant, more like someone being hurried to answer questions on the spot. Some associations are stronger and come to mind immediately, some are less "fresh in memory", and then associations can bring false positives and it takes extra time/effort to notice and correct those. It's a common human experience, too. "Yes, those reserved words are 'void', 'var', 'volatile',... wait, 'var' is JS stuff, it's not reserved in C..." etc.
Then, of course, humans are learning continuously, and - perhaps more importantly - even if they're not learning, they're reinforcing (or attenuating) existing associations through bringing them up and observing feedback. LLMs can't do that on-line, but that's an engineering limitation, not a theoretical one.
I'm not claiming that LLMs are equivalent to humans in general sense. Just that they seem to be implementing the fundamental machinery behind "meaning" and "understanding" in general sense, and the theoretical structure behind it is quite pretty, and looks to me like a solution to a host of philosophical problems around meaning and language.
One can always find a kind of confirmation bias analysis here, which "saves the appearances", ie., one can always say "take a measurement set of people's mental capacities, given in their linguistic behaviour" and find such behaviours apparent in LLMs. This will always be possible for the obvious reason that LLMs are trained on human linguistic practice.
This makes "linguistic measurement" of LLMs especially deceptive. Consider the analogous case of measuring a video game by its pixels: does it really have a "3d space"? No. It only appears to. We know that pixel-space measurements of video games are necessarily deceptive, because we constructed them that way, so it is obvious that you cannot "walk into a tv".
Yet we did not construct the mechanism of deception in LLMs, making seeing thru the failure of "linguistic measurement" apparently somewhat harder. But I imagine this is just a matter of time -- in particular, when LLMs' mechanisms are fully traced, it will be more obvious that their outputs are not generated for the reasons we suppose. That the "reason to linguistic output" mapping we use on people is deceptive as applied to LLMs. Just as a screenshot of a video game is a deceptive measure, whereas a photograph isn't. For a photograph, the reason the mountain is small is because it's far away; for a screenshot, it isn't: there is no mountain, it is not far away from the camera, there is no camera.
In the case of LLMs we know they cannot mean what they say. We know that if an LLM offers a report on New York it cannot mean what a person who has travelled to New York means. The LLM is drawing on an arrangement, in token space, of tokens placed there by people who have been to New York. This arrangement is like the "rasterization" of a video game: it places pixels as-if there were 3d. You could say, then, that an LLM's response is a kind of rasterization of meaning.
And just as with a video game, there are failures, eg., clipping through "solid" objects. LLMs do not genuinely compose concepts, because they have no concepts -- they can only act as if they are composing them, so long as a token-space measurement of composition is available in the weight-space of the model. (And so on...)
The failures of LLMs to have these capacities will be apparent after a while; at the moment we're on the hype rollercoaster, and it's not yet peaked. At the moment, people are still using the "reason-linguistic" mapping they've learned from human communication on LLMs, to impart the relevant mental states they would with people. The boundaries of the failure of this mapping aren't yet clear to everyone. Users don't yet avoid "clipping thru" objects, because they don't understand what clipping is -- at the moment, many seem desperate to say that if a video game object is clipped thru, it must be designed to be hollow.
In any case, as I've said in many places in this thread (which you can see from my recent comment history) -- there are a large variety of mental capacities associated with apprehending meaning that LLMs lack. But the process is anti-inductive so it will take quite a while: for all those who are finding the fragile boundaries ("clipping thru the terrain"), new models come out with invisible walls.
- On how embeddings work;
- On the observation that in very high-dimensional space you can encode a lot of information in relative arrangement of things;
- On the observation that the end result (LLMs) are too good at talking and responding like people in nuanced way for this to be uncorrelated;
- On noticing similarities between embeddings in high-dimensional spaces and what we arrive at when we try to express what we mean by "concept", "understanding" and "meaning", or even how we learn languages and acquire knowledge - there's a strong undertone of defining things in terms of similarity to other things, which themselves are defined the same way (recursively). Naively, it sounds like infinite regress, but it's exactly what embeddings are about (see the toy sketch after this list).
- On the observation that the goal function for language model training is, effectively, "produce output that makes sense to humans", in fully general meaning of that statement. Given constraints on size and compute, this is pressuring the model to develop structures that are at least functionally equivalent to our own thinking process; even if we're not there yet, we're definitely pushing the models in that direction.
- On the observation that most of the failure modes of LLMs also happen to humans, up to and including "hallucinations" - but they mostly happen at the "inner monologue" / "train of thought" level, and we do extra things (like explicit "system 2" reasoning, or tools) to fix them before we write, speak or act.
- And finally, on the fact that researchers have been dissecting and studying inner workings of LLMs, and managed to find direct evidence of them encoding concepts and using them in reasoning; see e.g. the couple major Anthropic studies, in which they demonstrated the ability to identify concrete concepts, follow their "activations" during inference process, and even control the inference outcome by actively suppressing or amplifying those activations; the results are basically what you'd expect if you believed the "concepts" inside LLMs were indeed concepts as we understand them.
- Plus a bunch of other related observations and introspections, including but not limited to paying close attention to how my own kids (currently 6yo, 4yo and 1.5yo) develop their cognitive skills, and what are their failure modes. I used to joke that GPT-4 is effectively a 4yo that memorized half the Internet, after I noticed that stories produced by LLMs of that time and those of my own kid follow eerily similar patterns, up to and including what happens when the beginning falls out of the context window. I estimated that at 4yo, my eldest daughter had a context window of about 30s long, and I could see it grow with each passing week :).
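To make the embeddings point concrete, a toy sketch with made-up 4-dimensional vectors (a real model learns thousands of dimensions; the words and numbers here are invented):

```python
import math

# Toy illustration only: "meaning" as nothing but relative position in a vector space.
vectors = {
    "dog":        [0.90, 0.80, 0.10, 0.00],
    "puppy":      [0.85, 0.90, 0.15, 0.05],
    "cat":        [0.80, 0.30, 0.20, 0.00],
    "carburetor": [0.00, 0.10, 0.90, 0.80],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# dog sits near puppy, less near cat, and far from carburetor.
for word in ("puppy", "cat", "carburetor"):
    print(word, round(cosine(vectors["dog"], vectors[word]), 3))
```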
That, in a gist, is what adds up to my current perspective on LLMs. It might not be hard science, but I find a lot of things pointing in the direction of us narrowing down on the core functionality that also exists in our brain (but not the whole thing, obviously), and very little that points otherwise.
(I actively worry that it might be my mental model is too "wishy washy" and lets me interpret anything in a way that fits it. So far, I haven't noticed any warning signs, but I did notice that none of the quirks or failure modes feel surprising.)
--
I'm not sure if I got your videogame analogy the way you intended, but FWIW, we also learn and experience lots of stuff indirectly; the whole point of language and communication is to transfer understanding this way - and a lot of information is embodied in the larger patterns and structures of what we say (or don't say) and how we say it. LLM training data is not random, it's highly correlated with human experience, so the information for general understanding of how we think and perceive the world is encoded there, implicitly, and at least in theory the training process will pick up on it.
--
I don't have a firm opinion on some of the specifics you mention, just a couple of general heuristics/insights that tell me it could be possible we've narrowed down on the actual thing our own minds are doing:
1. We don't know what drives our own mental processes either. It might be we discover LLMs are "cheating", but we might also discover they're converging to the same mechanisms/structures our own minds use. I don't have any strong reason to assume the former over the latter, because we're not designing LLMs to cheat.
2. Human brains are evolved, not designed. They're also the dumbest possible design evolution could arrive at - we're the first to cross the threshold after which our knowledge-based technological evolution outpaced natural evolution by orders of magnitude. All we've achieved to date, we did with a brain that was the nature's first prototype that worked.
3. Given the way evolution works - small, random, greedy increments that have to be incrementally useful at every step - it stands to reason that whatever the fundamental workings of a mind are, they must not be that complicated, and they can be built up incrementally through greedy optimization. Humans are a living proof of that.
4. (most speculative) It's unlikely there are multiple alternative implementations of thinking minds that are very different from each other, yet all equally easy to reach through random walk, and that evolution just picked one of those and ran with it. It's more likely that, when we get to that point (we might already be there), we'll find the same computational design nature did. But even if not, diffing ours and nature's solutions will tell us much about ourselves.
That's assuming that LLMs operate according to how we read their text. What you're doing is reading LLM chain-of-thought as if said by a human, and imparting the capacities that would be implied if a human said it. But this is almost certainly not how LLMs work.
LLMs are replaying "linguistic behaviour" which we take, often accurately, to be dispositive of mental states in people. They are not evidence of mental capacities and states in LLMs, for seemingly obvious reasons. When a person says, "I am hungry", it is, in veridical cases, caused by their hunger. When an LLM says it, the cause is something like, "responding appropriately, according to a history of appropriate use of such words, on the occasion of a prompt which would, in ordinary historical cases, give this response".
The reason an LLM generates a text prima facie never involves any associated capacities which would have been required for that text to have been written in the first place. Overcoming this leap of logic requires vastly more than "it seems to me".
> On how embeddings work
The space of necessary capacities is not exhausted by "embedding", by which you mean a (weakly) continuous mapping of historical exemplars into a space. Eg., logical relationships, composition, recursion, etc. are not mental capacities which can be implemented this way.
> We don't know what drives our own mental processes either.
Sure we do. At the level of enumerating mental capacities, their operation and so on, we can give very exhaustive lists. We do not know how even the most basic of these is implemented biologically, save that, I believe, we can say quite a lot about how properties of complex biological systems generically enable this.
But we have a lot of extremely carefully designed experiments to show the existence of relevant capacities in other animals. None of these experiments can be used on an LLM, because by design, any experiment we would run would immediately reveal the facade: any measurement of the GPU running the LLM and its environmental behaviour shows a total empirical lack of anything which could be experimentally measured.
We are, by the charlatan's design, only supposed to use token-in/token-out as "measurement". But this isn't a valid measure, because LLMs are constructed on historical cases of linguistic behaviour in people. We know, prior to any experiment, that the one thing designed to be a false measure is the linguistic behaviour of the LLM.
Its as if we have constructed a digital thermometer to always replay historical temperature readings -- we know, by design, that these "readings" are therefore never indicative of any actual capacity of the device to measure temperature.
I would add that AI often makes far too clever code as well, and would defer to Kernighan's law: “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”
The LLM may or may not be clever enough, but you aren't clever enough to evaluate its debugging.
[edit] Makeshift mailing list for those interested: https://github.com/sdegutis/90s.dev/issues/2 (subscribe or comment to "sign up for emails" on the topic)
But maybe first and foremost I need a mailing list so people can be notified of things like this when they're announced/released?
I also like /r/tinycode for its spirit. My #1 saying in coding is "simplicity is harder than complexity." It's gone downhill like the rest of Reddit though.
I'm also not totally anti-AI. I use it a little bit. I just think if you aren't a good developer you aren't competent to use it properly. It's like saying autocomplete will make a bad developer good. I think it's like super autocomplete. Also found it useful for "here's a big block of code, explain it" -- you have to check what it tells you of course, but it gives you a passable first pass analysis, like asking a junior dev to do that.
To clarify the AI stance, I meant it in the context of the article: it would encourage cultivating our skills so it both grows and doesn't atrophy.
That's a great way of putting what I've been thinking for a few years now. It's the same reason I designed https://immaculata.dev so differently than Vite. Sure I could throw a ton of code at the problem, but carefully solving the problem correctly is both simpler and harder.
> It's gone downhill like the rest of Reddit though.
Exactly, which is part of the charm of HN. I want to capture that for 90s.dev but focused solely on software skill cultivation (and sharing the wonder and beauty and joy of writing software in the 90s) rather than the topic-soup of HN.
But many people will use unrestrained tool-enabled agents to do all of their work. And so be it, there’s already bad software out there, always has been. It’s easier to make and distribute software. It’s high calorie and tastes great and is FAST.
But you have to hope it will be more of a tool than a full meal.
Yet nowhere does he address the #1 flaws in his position: the rate of improvement of the technology, and its promise to deliver on saved money and gained speed.
In all the companies I've seen, engineering leadership hardly gives a shit about the things OP says are important. They just care that customers are happy, the system is stable, and it's malleable enough to allow for pivots when need be.
Good discussions & documentation about architecture before starting the work that gets peer-reviewed + A non-stupid engineer putting on good guardrails around the LLM's output + the extensive unit test suites in CD/CI + peer reviews on the PRs = all downsides near eliminated while all upside gained.
This is how we work at my company today (health startup). Google and Meta also boast publicly that 30%+ of new lines of code are AI-generated at their companies today. That's the state of *today*; assume in 5 years these AIs are 10x better... I simply cannot foresee a world where LLM-assisted coding is not the de-facto way to be a software engineer.
It's not at all clear that the upside gained outweighs the cost in your hypothetical scenario, nor should it be taken as a given that the trend line will continue as is for long enough to create enough upside to outweigh that cost.
Why has OpenAI been acquiring "application" layer companies for large financial sums, instead of improving their own tools to build application layer codebases?
> 30% of new lines of code is AI-generated
"Watching AI drive Microsoft employees insane", 500 comments, https://news.ycombinator.com/item?id=44050152
Your examples of people "not giving a shit" about things the OP says are important & Google/Meta boasting about AI use are all revenue driven; people in leadership roles commonly place company revenue over product quality, in which case they shouldn't give a shit about the OP's topic.
As an engineer / IC I care about product quality because that's what gives me personal fulfillment. As a founder I care about product quality because I entered into this enterprise to solve a problem, not to sell a solution. Many people do the latter (very successfully) & this article isn't for those people. But it's relevant to me.
Both are companies with heavy investment into their AI products, hence an extremely biased view. I’d take that with a huge grain of salt.
> assume in 5 years these AIs are 10x better... I simply cannot foresee a world
Over the last 3ish years the improvements to the performance were significant, but incremental. What makes you think that the models will continue to improve at a substantially faster rate than today? Especially considering past releases have already demonstrated the diminishing returns with larger models and more computational power. Then there is also the pressure on the training data: downwards quality as well as ongoing litigation. From my POV there is more reason to believe the future development of LLMs will slow down somewhat rather than accelerate significantly.
You realize that Google and Meta sell AI products, right? So what you cited is effectively an ad campaign. Also the 30% NEW code is likely whittled down to 5-10% added to production after heavy edits. The devil is in the omitted details. :)
> In all the companies I've seen engineering leadership hardly really gives a shit about things OP says are important.
People put too much stock in “engineers” and “managers”. What we’re really talking about here is in the realm of sociology and psychology.
I think there’s a lot of evidence (just ask anyone in academia) that AI is already diminishing people’s ability to think for themselves.
There’s a lot of power in AI it’s true—but let’s not get blinded by the gold and leave everyone bloodied in the muck.
I wonder where the author got that feeling. What recent LLMs have proved time and time again is that they are definitely able to work at a conceptual level (by correctly translating concepts from one language to another depending on the context, for example). Saying it doesn't "understand" the concepts as humans do is a different thing. It wouldn't "understand" pain, because it has no experience of it. But humans constantly talk about things they've never personally experienced (and indeed maybe they shouldn't, but that's another topic).
This is a weak model of some features of concepts, eg., association: "dog" is associated with "cat", etc. But it, e.g., does not model composition, nor intension, nor the role of the term in counterfactuals. (See my comment elsewhere in this comments section on this issue).
However you can always brute force your way to apparent performance in some apparently conceptual skill if the kinds of questions you ask are similar to the training data. So eg., if someone has asked, "if dogs played on mars, would they be happy?" etc. or similar-enough-families-of-questions... then that allows you to have a "dog" cluster around "literal facts" and a "dog" cluster around some subset of pre-known counterfactuals.
If you want to see the difference between this and genuine mental capabilities, note that there is an infinite combination of concepts of arbitrary depth, which can be framed in an infinite number of counterfactuals, and so on. And a child armed with only those basic components, and the capacity for imagination, can evaluate this infinite variety.
This is why we see LLMs being used most by narrow fields (esp. software engineers) where the kinds of "conceptual work" that they need have been extremely well documented and are sufficiently stable to provide some utility.
So far, the ability of LLMs to manipulate concepts has been indistinguishable in practice from "true" human-level concept manipulation. And not just for scientific, "narrow" fields.
If I give a child a physics exam and they score 100%, it could either be because they're genuinely a genius (possessing all relevant capabilities and knowledge), or because they cheated. Suppose we don't know how they're cheating, but they are. Now, how would you find out? Certainly not by feeding them more physics exams; at least, it's easy enough to suppose they can cheat on those.
The issue here is that the LLM has compressed basically everything written in human history, and the question before us is "to what degree is a 'complex search' operation expressing a genuine capability, vs. cheating?"
And there is no general methodological answer to that question. I cannot give you a "test", not least because I'm required to give you it in token-in--token-out form (ie., written) and this dramatically narrows the scope of capability testing methods.
Eg., I could ask the cheating child to come to a physics lab and perform an experiment -- but I can ask no such thing from an LLM. One thing we could do with an LLM is have a physics-ignorant-person act as an intermediary with the LLM, and see if they, with the LLM, can find the charge on the electron in a physics lab. That's highly likely to fail with current LLMs, in my view -- because much of the illusion of their capability lies in the expertise of the prompter.
> has been indistinguishable in practice from "true" human-level concept manipulation
This claim indicates you're begging the question. We do not use the written output of animal's mental capabilities to establish their existence -- that would be a gross pseudoscience; so to say that LLMs are indistinguishable from anything relevant indicates you're not aware of what the claim of "human-level concept manipulation" even amounts to. It has nothing to do with emitting tokens.
When designing a test to see if an animal possesses a relevant concept, can apply it to a relevant situation, can compose it with other concepts, and so on -- we would never look to linguistic competence, which even in humans, is an unreliable proxy: hence the need for decades of education and the high fallibility of exams.
Rather if I were assessing "does this person understanding 'Dog'?" I would be looking for contextual competence in application of the concept in a very broad role in reasoning processes: identification in the environment, counterfactual reasoning, composition with other known concepts in complex reasoning processes, and the like.
All LLMs do is emit text as-if they have these capacities, which makes a general solution to exposing their lack of them, basically methodologically impossible. Training LLMs is an anti-inductive process: the more tests we provide, the more they are trained on them, so the tests become useless.
Consider the following challenge: there are two glass panels, one is a window; and the other is a very high def TV showing a video game simulation of the world outside the window. You are fixed at a distance of 20 meters from the TV, and can only test each glass pane by taking a photograph of it, and studying the photograph. Can you tell which window is the outside? In general, no.
This is the grossly pseudoscientific experimental restriction people who hype LLMs impose: the only tests are tokens-in, tokens-out -- "photographs taken at a distance". If you were about to be thrown against one of these glass panels, which would you choose?
If an LLM was, based on token in/out analysis alone, put in charge of a power plant: would you live nearby?
It matters if these capabilities exist, because if real, the system will behave as expected according to capabilities. If it's cheating, when you're thrown against the wrong window, you fall out.
LLMs are, in practice, incredibly fragile systems, whose apparent capabilities quickly disappear when the kinds of apparent reasoning they need to engage in are poorly represented in their training data.
Consider one way of measuring the capability to imagine that isn't token-in/token-out: energy use and time-to-compute.
Here, we can say for certain that LLMs do not engage in counterfactual reasoning. Eg., we can give a series of prompts (p1, p2, p3...) which require increasing complexity of the imagined scenario, eg., exponentially more diverse stipulations, and we do not find O(answering) to follow O(p-complexity-increase). Rather the search strategy is always the same for a single-shot prompt: so no trace thru an LLM involves simulation. We can just get "mildly above linear" (apparent) reasoning complexity with chain-of-thought, but this likewise does not follow the target O().
The kinds of time-to-compute we observe from LLM systems are entirely consistent with a "search and synthesis over token-space" algorithm, which only appears to simulate if the search space contains prior exemplars of simulation. There is no genuine capability.
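To make the shape of that measurement concrete, a rough sketch; the `complete` function is a placeholder for whatever model/client is under test, the prompt family is invented, and a serious version would have to control for caching, batching, and thinking-token budgets:

```python
import time

# Sketch: measure time-to-answer as a function of how much counterfactual
# structure the prompt demands. `complete` is a stand-in, not a real client.
def complete(prompt: str) -> str:
    return "placeholder answer"  # swap in a real API call to run the experiment

def build_prompt(depth: int) -> str:
    stipulations = ", and ".join(f"stipulation {i} holds" for i in range(1, depth + 1))
    return f"Imagine a scenario in which {stipulations}. What follows for a dog living there?"

for depth in (1, 2, 4, 8, 16):
    start = time.perf_counter()
    complete(build_prompt(depth))
    elapsed = time.perf_counter() - start
    # The claim in the thread: if answering involved genuinely simulating the scenario,
    # cost should track the scenario's complexity; a flat curve is evidence it does not.
    print(f"depth={depth:>2}  seconds={elapsed:.3f}")
```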
On the contrary, I strongly believe that what LLMs have proved is something linguists have always told us: that language provides a structure on top of which we build our experience of concepts (the Sapir-Whorf hypothesis).
I don't think one can conceptualize much without the use of a language.
Well a great swath of the animal kingdom stands against you.
LLMs have invited yet more of this pseudoscience. It's a nonsense position in an empirical study of mental capabilities across the animal kingdom. Something previously only believed by idealist philosophers of the early 20th century and prior. Now brought back so people can maintain their image in the face of their apparent self-deception: better we opt for gross pseudoscience than admit we're fooled by a text generation machine.
On the extreme, we can talk about things like Aphantasia, Synesthesia and colour blindness and understand the concepts even if we never experienced them.
I'm still trying to figure out how to produce a codebase using LLMs and end up as an expert in the system at the end of it while still coming out ahead. My hope is I can be more of an expert a bit faster than before, not less of an expert a lot faster.
It feels within reach to me as long as there's frequent periods of coming to a complete understanding of the code that's been produced and reshaping it to reduce complexity. As well as strong technical guidance as input to the LLM to begin with.
I think there's still a lot of learning to do about how to use these things. For example, I've tried LLM-heavy things in Lua, C, and C#. The LLM was great at producing stuff that works (at first) in Lua, but Lua was miserable and the code ended up a huge mess that I can't be bothered to become an expert in. The LLM was really tripped up on C and I didn't make it that far; I didn't want to watch it fight the compiler so hard. C# has been great: the LLM is reasonably effective and I have an easy time consuming and reshaping the LLM code.
I've always liked static type systems but I like them even more now, in part because they help the LLM produce better code, but mostly because they make it a lot easier to keep up to speed on code that was just produced, or to simplify it.
I also had a similar typed-language experience: switching from untyped to type-hinted Python made the outputs much easier to understand and assess.
Both often work with unclear requirements and sometimes face intermittent bugs which are hard to fix, but in most cases SWEs create software that is expected to always behave in a certain way. It is reproducible, can pass tests, and the tooling is more established.
MLEs work with models that are stochastic in nature. The usual tests aren't about models producing a certain output; they are about metrics, e.g., that the model produces the correct output in 90% of cases (evaluation). The tooling isn't as developed as for SWEs, and it changes more often.
So, for MLEs, working with AI that isn't always reliable is the norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
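A minimal sketch of that mindset, with a hypothetical `assistant()` stand-in and hand-written cases; the point is that the thing under test is judged against a metric and a threshold, not an exact golden output:

```
# Hypothetical assistant under test; in practice this would call an LLM.
def assistant(task: str) -> str:
    return {"reverse 'abc'": "cba", "uppercase 'hi'": "HI"}.get(task, "not sure")

# Evaluation set: (input, checker) pairs rather than exact golden outputs.
cases = [
    ("reverse 'abc'", lambda out: out == "cba"),
    ("uppercase 'hi'", lambda out: out == "HI"),
    ("reverse 'xyz'", lambda out: out == "zyx"),
]

passed = sum(check(assistant(task)) for task, check in cases)
accuracy = passed / len(cases)
print(f"accuracy: {accuracy:.0%}")  # "it gets it right N% of the time"

# The MLE-style acceptance criterion is a threshold, not perfection:
assert accuracy >= 0.6
```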
As a concrete example, when I worked at Amazon, there were several really good ML-based solutions for very real problems that didn't have classical approaches to lean on. Motion prediction from grid maps, for example, or classification from imagery or grid maps in general. Very useful and well integrated in a classical estimation and control pipeline to produce meaningful results.
OTOH, when I worked at a startup I won't name, I was berated over and over by a low-level manager for daring to question a learning-based approach for, of all things, estimating the orientation of a stationary plane over time. The entire control pipeline for the vehicle was being fed flickering, jumping, ad hoc rotating estimates for a stationary object, because the entire team had never learned anything fundamental about mapping or filtering and was just assuming more data would solve the problem.
This divide is very real, and I wish there was a way to tease it out better in interviewing.
I'm curious: do you think there's any amount of high-quality data that could make the learning-based approach viable for orientation estimation? Or would it always be solving the wrong problem, regardless of data volume and delivery speed?
My sense is that effective solutions need the right confluence of problem understanding, techniques, data, and infrastructure. Missing any one piece makes things suboptimal, though not necessarily unsolvable.
In my current field (predictive maintenance), there are (in)famous examples and papers using multi-layer deep networks for solving anomaly detection problems, where a "single" line of basic Matlab code (standard deviations, etc.) performs better than the proposed AI solution. Publish or perish, I guess...
I think this is one reason software has such a flavor-of-the-month approach to development.
There are disclaimers everywhere.
Sure, there are use cases AI can't handle, but that doesn't mean it is not massively valuable. There is not a single thing in the world that can handle all use cases.
And given the current climate, the MLEs feel empowered to force their mindset onto other groups where it doesn't fit. I once heard a senior architect at my company ranting about that after a meeting: my employer sells products where accuracy and correctness have always been a huge selling point, and the ML people (in a different office) didn't seem to get that and thought 80-90% correct should be good enough for customers.
I'm reminded of the arguments about whether a 1% fatality rate for a pandemic disease was small or large. 1 is the smallest integer, but 1% of 300 million is 3 million people.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
I've given talks about this, my classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's boss's boss gets a call at 2am because you're in the news.
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out classification of types of dogs improved, but... that wasn't as important as this.
Issues and errors must be understood in context of the business. If your ML team is chucking models over the fence you're going to at best move slowly. At worst you're leaving yourself open to this kind of problem.
Through a career, SWEs start rigid and overly focused on the immediate problem and become flexible/error-tolerant[1] as they become system (mechanical or meat) managers. This maps to an observation that managers like AI solutions, because they compare favourably to the new hire, and because they have the context to make this observation.
[1] https://grugbrain.dev/#:~:text=grug%20note%20humourous%20gra...
I don't think it's the case with this article. It focuses on the meta-concerns of people doing software engineering and how AI fits into that. I think he hits it on the head when he talks about Program Entropy.
A huge part of building a software product is managing entropy. Specifically, how you can add more code and more people while maintaining a reasonable forward velocity. More specifically, you have to maintain the system so that all of those people understand how the pieces fit together and how to add more of them. Yes, I can see AI one day making this easier, but right now it oftentimes makes entropy worse.
Sorry not sorry that the rest of the world has to look over their shoulders.
I love the gray areas and probabilities and creativity of software...but not everyone does.
So the real danger is in everyone assuming the AI model is, must be, and always will be correct. They misunderstand the tool they are using (or directing others to use).
Hmm. It's like Autopilot on a Tesla. You aren't supposed to take your hands off the wheel. You're supposed to pay attention. But people use it incorrectly. If they get into an accident, then people want to blame the machine. It's not the machine's fault; it's the fault of the person who didn't read the instructions.
And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
But also, a lot of people weren't especially good at navigation before? The overall average ability of people to get from Point A to Point B safely and reliably, especially in areas they are unfamiliar with, has certainly increased dramatically. And a small subset of people who are naturally skilled at geography and navigation have seen their own abilities complemented, not replaced, by things like Google Maps.
I think AI will end up being similar, on a larger scale. Yes, there are definitely some trade offs, and some skills and abilities will decrease, but also many more people will be able to do work they previously couldn't, and a small number of people will get even better at what they do.
Entirely anecdotal but I have found the opposite. With this mapping software I can go walk in a random direction and confidently course correct as and when I need to, and once I’ve walked somewhere the path sticks in my memory very well.
Driving still requires careful attention to other drivers, the world goes by rapidly, and most roads look like other roads.
The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.
With LLMs even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate it into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.
LLMs are perfectly reproducible. Almost all public services providing them are not. The fact that changing the model changes the output doesn't make it not reproducible, in the same way reproducible software packages depend on a set version of the compiler. But you can run a local model with zero temperature, set starting conditions and you'll get the same response every time.
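For what it's worth, a minimal sketch of that claim using Hugging Face transformers and greedy decoding (the temperature-0 case); assuming fixed model weights, library versions, and hardware, repeated runs yield identical tokens:

```
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The maintenance burden of AI-generated code", return_tensors="pt")

# do_sample=False means greedy decoding: the arg-max token at every step,
# which is what "temperature 0" approximates. No randomness is involved.
out1 = model.generate(**inputs, max_new_tokens=30, do_sample=False)
out2 = model.generate(**inputs, max_new_tokens=30, do_sample=False)

assert torch.equal(out1, out2)  # same weights, same input, same output
print(tok.decode(out1[0]))
```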
My point is that it’s not a tool, because good tools reliably work the same way. If, for instance, a gun clicks when it’s supposed to fire, we would say that it malfunctioned. Or it fires when the safety is on. We can define what should happen, and if something else happens, there is a fault.
Is there evidence for this?
Then there is this Google Maps accident:
https://www.independent.co.uk/tv/news/driver-bridge-google-m...
Which tells you that following directions of a computer makes people more stupid.
Simply because the media didn't report on it....
My wife was very uncomfortable going to a new location via paper maps and directions. She’s perfectly happy following “bitching Betty” from the phone.
The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator. You can rely on its output, the same way you can rely on a calculator. Not always, mind you, because mapping the entire globe is a massively complex task with countless caveats and edge cases, but compared to LLM output? Even with a temperature setting of 0 with the same prompt regenerated multiple times, you'll be getting vastly different output.
Also, since LLMs cover a much more broad swathe of concepts, people are going to be using these instead of their brains in a lot of situations where they really shouldn't. Even with maps, there are people out there that will drive into a lake because Google Maps told them that's where the street was, I can't even fathom the type of shit that's going to happen from people blindly trusting LLM output and supplanting all their thinking with LLM usage.
Actually, TSP is NP-hard (ie, at best, you never know whether you've been given the optimal route) in the general case, and Google maps might even give suboptimal routes intentionally sometimes, we don't know.
The problems you're describing are problems with people and they apply to every technology ever. Eg, people crash cars, blow up their houses by leaving the stove on, etc.
Not really.
I am not good at navigation yet love to walk around, so I use a set of maps apps a lot.
Google Maps is not reliable if you expect optimal routes, and its accuracy falls sharply if you're not traveling by car. Even then, bus lanes, priority lanes, time-limited areas, etc., will be a bloodbath if you expect Maps to understand them.
Mapping itself will often be inaccurate in any town that isn't frozen in time for decades, place names are often wrong, and it has no concept of verticality/3D space, short of switching to limited experimental views.
Paid dedicated map apps will in general work a lot better (I'm thinking hiking maps etc.)
All to say, I'd mostly agree with the parent about how fuzzy Maps is.
Er, no?
Google maps is 90% of the time better than a taxi driver where I live.
AI isn’t better than some person that did the thing for a couple days
One thing working with AI-generated code forces you to do is to read code -- development becomes more a series of code reviews than a first-principles creative journey. I think this can be seen as beneficial for solo developers, as in a way, it mimics / helps learn responsibilities only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from a design side, writing a detailed spec, then implementing sections of it -- this helps to define boundaries and interfaces for the conceptual building blocks.
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
I also think that LLMs are an even more powerful accelerant for senior developers. We can prompt better because we know what exists and what to not bother trying.
Just look at recent news: layoff after layoff from big tech, mid-size tech, and small tech.
That's what I thought. The AI girlfriend/boyfriend app things seem to suggest otherwise
I don't get it, but apparently others do
If it works, like to hit a nail, you end up smashing everything in sight. If it fails, like digging a garden, you end up thinking it is stupid.
But there is a third case.
You use it to do something that you did not know you could do before. Like to planish metal.
People are experiencing the first and second cases.
Minor quibble: On the chart at top, "Inverse correlation" would show up as a hyperbola. The line is more of a negative correlation. Just sayin' :-)
They have a great faith in AI (which is understandable), but they're constantly realising that:
a) they don't understand any of the problems well enough to even begin prompting for a solution
b) the AI can explain our code but the manager still won't understand
c) the AI can rephrase our explanations and they still won't understand.
Traditionally middle-managers probably consoled themselves with the idea that the nerds can't communicate well and coding is a dumb arcane discipline anyway. But now that their machine god isn't doing a better job than we are of ELI5ing it, I think even they're starting to doubt themselves.
Stated with pride? Given the proofreading and critiquing abilities of AI this dubious boast is a useful signal for the head-in-sand arguments of the essay.
AI is a profound step change in our workflow abilities, a revolution already. Wise to be wary of it, but this reads as shouting at clouds.
The landscape has changed, we have to change with it.
> Only humans can decrease or resist complexity.
It's funny how often there's a genuine concept behind posts like these, but then lots of specific claims are plainly false. This is trivial to do: ask for simpler code. I use that quite often to get a second opinion and get great results. If you don't query the model, you don't get any answer, neither complex nor simple. If you query with default options, that's still a choice, not something inherent to the idea of an LLM.
I'm also having a great time converting code into ideas and diagrams and vice versa. Why make the strong claims that people contradict in practice every day now?
> If you’re doing a DIY project
Let me know what you're trying to achieve.
Which is basically the SO style question you mentioned.
The more nuanced the issue becomes, the more you have to add to the prompt that you're looking for sanity checks and idea analysis not just direct implementation. But it's always possible.
I frequently have the LLM write a proposal.MD first and iterate on that, then have it produce the full solution and iterate on that.
It's interesting to see whether the proposal comes out like I had in mind, and many times it uses tech or ideas that I didn't know about myself, so I am constantly learning too.
The reason LLMs are such a big deal is that they are humanity's first tool general enough to support recursion (besides humans, of course). If you can use an LLM, there's something like a 99% chance you can program another LLM to use that LLM in the same way you do:
People learn the hard way how to properly prompt an LLM agent product X to achieve results -> some company is going to encode these learnings in a system prompt -> we now get a new agent product Y that is capable of using X just like a human -> we no longer use X directly. Instead, we move up one level in the command chain, to use product Y instead. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in realtime with coding agents in the past few months.
Well yes, LLMs are not teleological, nor inventive.
" Is there an “inventiveness test” that humans can pass but LLMs don’t?"
Of course: any topic where there is no training data available and that cannot be extrapolated by simply mixing the existing data. Admittedly that is harder to test on current unknowns and unknown unknowns.
But it is trivial to test on retrospective knowledge. Just train the AI on text up to, say, 1800 and see if it can come up with antibiotics and general relativity, or if it will simply repeat outdated notions of disease theory and Newtonian gravity.
LLMs are blank slates (like an uncultured, primitive human being; an LLM does come with built-in knowledge, but that built-in knowledge is mostly irrelevant here). LLM output is purely a function of the input (context), so an agentic system's capabilities do not equal the underlying LLM's capabilities.
If you ask such an LLM "overturn Newtonian physics, come up with a better theory", of course the LLM won't give you relativity just like that. The same way an uneducated human has no chance of coming up with relativity either.
However, ask it this:
``` You are Einstein ... <omitted: 10 million tokens establishing Einstein's early life and learnings> ... Recent experiments have put these ideas to doubt, ...<another bunch of tokens explaining the Michelson–Morley experiment>... Any idea why this occurs? ```
and provide it with tools to find books, speak with others, run experiments, etc. Conceivably, the result will be different.
Again, we pretty much see this play out in coding agents:
Claude the LLM has no prior knowledge of my codebase so of course it has zero chance of solving a bug in it. Claude 4 is a blank slate.
Claude Code the agentic system can:
- look at a screenshot.
- know what the overarching goal is from past interactions & various documentation it has generated about the codebase, as well as higher-level docs describing the company and products.
- realize the screenshot is showing a problem with the program.
- form hypothesis / ideate why the bug occurs.
- verify hypotheses by observing the world ("the world" to Claude Code is the codebase it lives in, so by "observing" I mean it reads the code).
- run experiments: modify code then run a type check or unit test (although usually the final observation is outsourced to me, so I am the AI's tool as much as the other way around.)
The article also repeats some weird arguments that are superficially true but don't stand up to scrutiny. That Naur thing, which is a meme at this point, is often repeated as somehow insightful, yet it forgets another fundamental, practical rule of software engineering: any nontrivial program quickly exceeds anyone's ability to hold a full theory of it in their head. We almost never work with a proper program theory; programming languages, techniques, methodologies, and tools all evolve towards enabling people to work better without understanding most of the code. We actually share the same limitations as LLMs here; we're just better at managing them, because we don't have to wait for anyone to let us do another inference loop before we can take a different perspective.
Etc.
Do you do it manually or a have automated tool? (I am looking for the latter.)
LLMs can explain the code they generate if you just ask - they never run out of patience. You can ask how it could be made faster, then ask why it did those specific things.
AI lets those with initiative shine so bright they show everyone else how it’s done.
> Only humans can decrease or resist complexity.
For a simple program, maintenance is naturally entropy-increasing: you add an `if` statement for an edge case, and the total number of paths/states of your program increases, which increases entropy.
But in very large codebases it's more fluid, and I think LLMs have the potential to massively _reduce_ complexity by recommending places where state or logic should be decoupled into a separate package (for example, where a similar method is being called in multiple places in the codebase). This is something that can be difficult to do "as a human" unless you happen to have worked in those packages recently and are cognizant of the pattern.
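A toy sketch of the kind of decoupling meant here (hypothetical names; the point is the shape of the refactor, which a tool that has seen both call sites can suggest):

```
# Before: the same normalization logic is re-implemented at two call sites,
# so every edge case added to one copy increases entropy in both.
def handle_signup(email: str) -> str:
    return email.strip().lower()

def handle_invoice(email: str) -> str:
    return email.strip().lower()

# After: the shared logic lives in one place (in a large codebase, a
# separate module/package), so new edge cases are added exactly once.
def normalize_email(email: str) -> str:
    return email.strip().lower()

def handle_signup_v2(email: str) -> str:
    return normalize_email(email)

def handle_invoice_v2(email: str) -> str:
    return normalize_email(email)

print(handle_signup_v2("  Alice@Example.COM "))  # alice@example.com
```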
In the GPT-2 days I used to read GPT-generated text a lot; I was working on a game where you guess whether text is AI-generated or not, and for weeks while I was working on it I had strange dreams. It went away when I stopped consuming tokens. In the GPT-4 age this doesn't happen, even though I am reading hundreds of times more tokens than back then, but I think the effect is just more subtle.
Now I use AI to generate thousands of lines of code per day at minimum. Sometimes I just blank out when the AI doesn't spit out the code fast enough: I don't know what I am supposed to write, which libraries it is using, what the goal of this whole function is, etc. It is not my code; it is foreign, and I honestly don't want to be reading it at all.
This week I took the whole week off work and am just coding without AI, and in a few days the "blank out" is gone. Well, I did use AI to read the 300-page docs of the st7796s and write a barebones SPI driver, for example, but I treat it almost as an external library: I give it the docs and an example driver and it just works, but it is somewhat external to my thought process.
People argue that all fields have evolved, e.g. there are no more blacksmiths. But I argue that machinists now are much more sophisticated than the ones in the past: pneumatic hammers allow them to work better and faster, and as they use the hammer they get a better understanding of the material they work with; the machine does not take away their experience and ability to learn. I always had two days per week where I code without any AI, but now I think I have to change the way I code with it.
AI is for sure making me worse, and lazy. And I am not talking about the "career" here, I am talking about my ability to think.
I wrote few days ago about it: https://punkx.org/jackdoe/misery.html
I’m not an LLM user myself, but I’m slowly incorporating (forcing myself, really) AI into my workflow. I can see how AI as a tool might add value; not very different from, say, learning to touch-type or becoming proficient in Vim.
What is clear to me is that powerful tools lower entry barriers. Think Python vs C++. How many more people can be productive in the former vs the latter? It is also true that powerful tools lend themselves to potentially shitty products. C++ that is really shitty tends to break early, if it compiles at all, whereas Python is very forgiving. Intellisense is another such technology that lowers barriers.
Python itself is a good example of what LLMs can become. Python went from a super powerful tool in a jack-of-trades-master-of-none sort of way, to a rich data DSL driven by Numpy, Scipy, Pandas, Scikit, Jupyter, Torch, Matplotlib and many others; then it experienced another growth spurt with the advent of Rust tooling, and it is still improving with type checkers, free threading and even more stuff written in Rust - but towards correctness, not more power.
I really do hope that we can move past the current fomo/marketing/bullshit stage at some point, and focus on real and reproducible productivity gains.
Quoth the makers of Claude:
> AI systems are no longer just specialized research tools: they’re everyday academic companions.
> https://www.anthropic.com/news/anthropic-education-report-ho...
To call Anthropic's opener brazen, obnoxious, or euphemistic would be an understatement. I hope it ages like milk, as it deserves, and so embarrasses the company that it eventually requires a corrective foreword or retraction.
I believe there’s a small minority of people that truly believes AI is a friend, but I would say it’s a psychological pathology.
I don’t bother trying to guess what large companies really think: a/ they’re made of so many different stakeholders I don’t think it’s possible. And b/ I know money is the most important thing if they are large enough and have lots of anonymous investors, I don’t need to know anything else.
Replika AI
I often found myself adding "use built-in features if they exist", just because of this type of scenario.
It unsettles me that some people feel okay always accepting AI code, even when it "works".
Mirroring the example provided in the article: I once saw a 200-line class implementation for tridiagonal matrices in Python, where a simple NumPy command would suffice and perform an order of magnitude better.
In practice, I find this approach reduces productivity in favor of gaining a deeper understanding of how things work - instead of just naively using LAPACK/BLAS based libraries, one 'wastes time' diving into how they work internally, which previously would have been very opaque.
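To make that comparison concrete, a hedged sketch of the short alternative, assuming the hand-rolled class was ultimately solving tridiagonal linear systems (SciPy's banded solver rather than literally one NumPy command):

```
import numpy as np
from scipy.linalg import solve_banded

n = 5
main = 2.0 * np.ones(n)        # main diagonal
upper = -1.0 * np.ones(n - 1)  # superdiagonal
lower = -1.0 * np.ones(n - 1)  # subdiagonal
b = np.ones(n)                 # right-hand side

# Banded storage for a tridiagonal matrix: rows are (upper, main, lower),
# padded so every row has length n.
ab = np.zeros((3, n))
ab[0, 1:] = upper
ab[1, :] = main
ab[2, :-1] = lower

x = solve_banded((1, 1), ab, b)  # O(n) solve instead of a 200-line class
print(x)
```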
These are tools, it's up to you how you use them. Kind of like compilers, really.
If you are able to make such deductions then you should also be able to deduce that almost nobody employed in a "software engineering" role is doing any actual engineering.
The article assumes that software companies are hiring software engineers (where "engineer" actually means what it does everywhere else) when in reality most software companies are not hiring any kind of actual engineer.
And if engineers can't be replaced by AI, but you're not actually an engineer, can you be replaced by AI?
Now I don't know the answer for sure, but I'd say for most people in "software engineering" roles the answer is still no, at least for now. But the reasons definitely can't have anything to do with whether AI can do engineering.
As a final note: I'd recommend anyone in a software engineering role, who thinks they do "actual engineering", to actually study some actual engineering discipline. You most likely have the necessary level of intelligence to get started. But you will quickly find that electrical engineering, electronics engineering, mechanical engineering, structural engineering, drainage engineering, etc, are nothing like your actual day to day job in _very fundamental_ ways.
Once you get beyond the simplest code, you are practicing tradeoffs and balance. You can use a simple, memory-intensive algorithm, but you need to understand whether you have the space to use it. You might be able to develop software in 1/2 the time if you take a basic approach, but it won't scale.
I don't know if you develop software or not. Regardless think more deeply about what is involved in engineering.
I notice you said you hold degrees. But which of these two disciplines do you actually work in?
We all like to think that we are grand architects finely honing software for the ages, when in reality we are really just specifying that the right grade of gravel is being used in the roadbed so that you don't get potholes.
Software engineering is like deciding after you built the bridge that it needs to now be able to open. Oh and the bridge is now busy and used heavily so we can't allow for any downtime while you rebuild the bridge to open.
And as much as we hope for standards so everything is cookie-cutter, there are idiosyncrasies all over the place, so no project is ever really the same.
These days I think of software more like writing fiction than engineering
Also bridges are never done. They are continually inspected and refurbished. Every bridge you build has an on-going cost. Just like software.
Every line you accept without understanding is borrowed comprehension, which you’ll repay during maintenance with high interest. It feels like free velocity. But it's probably more like tech debt at ~40% annual interest. As a tribe, we have to figure out how to use AI to automate typing and NOT thinking.
Or would be, if the LLM actually understood what it was writing, using the same definition of understanding that applies to human engineers.
Which it doesn't, and by its very MO, cannot.
So, every line from an LLM that is accepted without understanding is really nonexistent comprehension. It's a line of code spat out by a stochastic model, and until some entity that can actually comprehend a codebase's context, systems, and design reads and understands it (and currently the only known entity that can do that is a human being), it remains un-comprehended.
And the “rule of three” basically ceases to be applicable between components: either the code has localized impact, or it is part of a rock-solid foundational library. Intermediate cases just explode the refactoring complexity.
With LLM assistance, it might become easier to maintain a Markdown file containing the “theory” of a program. But what should be in the file?
For me, writing code has never ever been the challenge. Deciding what to write has always been the challenge.
I have this folder of academic papers from when access was free during covid which is enough to keep me busy for quite a while. Usually I get caught up with the yak shaving and never really progress on whatever I was intending to work on but now I have this super efficient yak shaver so I can, umm, still get caught up with the yak shaving.
But, alas, shaving yaks and arguing with stupid robots makes me happy so...
(A movie studio executive who believes a screen writer has been harassing him with threatening postcards has just murdered the screenwriter the previous night. The executive arrives late at a mid-morning studio meeting as other executives argue about the lavishness of writers' fees)
Movie studio boss: Griffin, you're here! We were just discussing the significance of writing. Maybe you have something to add?
Executive: I was just thinking what an interesting concept it is to eliminate the writer from the artistic process. If we could just get rid of these actors and directors, maybe we've got something here.
(Assistant hands the executive another postcard)
—
Re AI chat:
High school students who are illiterate now use AI chat to orchestrate the concoction of course papers they can't understand, then instructors use AI chat to grade the work. The grads apply for jobs and get selected by AI chat, then interviewed by managers using AI chat. The lucky hires are woken up by their IOs and transported to an office at an unknown destination by autonomous cars. Their work is to follow scripts emitted by AI chat to apply AI chat to guide refinement of processes, further feeding LLMs. Once returned to the home cubicle after the shift, the specialist consumes the latest cartoons and sexts with an AI companion. The IO sounds the alarm and it's time to sleep.
If the MCP guys can just cut out the middle men, I think we've got something here!
—
The threat of the new machines is not AGI overlords who will exterminate an optional humanity. The threat is an old one that's been proven over millennia of history: the conversion of human beings into slaves.
Even if all the wonders were true that people love to believe in about LLMs, you cannot get around this argument.
> Input Risk. An LLM does not challenge a prompt which is leading or whose assumptions are flawed or context is incomplete. Example: An engineer prompts, "Provide a thread-safe list implementation in C#" and receives 200 lines of flawless, correct code. It's still the wrong answer, because the question should have been, "How can I make this code thread-safe?" and whose answer is "Use System.Collections.Concurrent" and 1 line of code. The LLM is not able to recognize an instance of the XY problem because it was not asked to.
When I prompt Gemini 2.5 Pro with "Provide a thread-safe list implementation in C#" it _does_ push back and suggest using the standard library instead (in addition to providing the code of course). First paragraph of the LLM response:
> You can achieve a thread-safe list in C# by using the lock statement to synchronize access to a standard List<T>. Alternatively, you can use concurrent collection classes provided by .NET, such as ConcurrentBag<T> or ConcurrentQueue<T>, depending on the specific access patterns you need.
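As an aside, the same "reach for the standard library" answer sketched in Python rather than C# (a hypothetical analogue of the shape of the fix, not the article's example):

```
import queue
import threading

# If the access pattern is producer/consumer, the standard library already
# provides a thread-safe structure:
q = queue.Queue()
q.put("work item")
print(q.get())

# If random access to a list really is needed, a lock around the shared
# list is the short answer rather than a custom class:
items, lock = [], threading.Lock()

def append_safely(value):
    with lock:
        items.append(value)

append_safely(42)
print(items)  # [42]
```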
That's not categorically true: if a theory/design fits in their context window it's possible that they _can_ master it.
LLMs shine for simple tasks with little context. There are plenty of tasks like that.
Proof needed
Yet The Prodigy made good albums entirely with Reason.
If their whole job is throwing together disposable demos and 1-use scripts, I could believe 4x. But in the normal case, where the senior engineer's time is mostly spent wrangling a legacy codebase into submission? I just don't see LLMs having that level of effect.
this outcome is far from certain (unless you're a slopper, in which case it's obviously happening tomorrow)