Correct me if I'm wrong, but those studies would not have looked at TypeScript itself, which for all I know could be complete garbage designed to lock you into MSFT products.
It’s a formal acknowledgement that humans make mistakes, that implicit assumptions are dangerous, and that code should be validated before it runs. That’s literally the whole point, and if developers were the "YOLO, I’ll push it anyway" type, TS wouldn’t have gotten anywhere near the traction it has. Static typing is a monument to distrust.
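For what it's worth, here's a minimal TypeScript sketch of the kind of implicit assumption the compiler refuses to let slide (the `User` interface and `sendWelcome` function are just made-up names for illustration):

```typescript
// Hypothetical example: "every user has an email" is exactly the sort of
// unstated assumption that static checking surfaces before the code runs.
interface User {
  id: number;
  name: string;
  email?: string; // optional: not every user has one
}

function sendWelcome(user: User): string {
  // Under strictNullChecks, using user.email directly is rejected:
  //   return "Welcoming " + user.email.toLowerCase();
  //   // error: 'user.email' is possibly 'undefined'
  if (user.email === undefined) {
    return `No email on file for ${user.name}`;
  }
  return `Sending welcome mail to ${user.email.toLowerCase()}`;
}

console.log(sendWelcome({ id: 1, name: "Ada" })); // caught at compile time, not in prod
```

Nothing fancy, but it's the distrust-of-humans argument in miniature: the mistake is trivial to make and the checker flags it before anything ships.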
Yes, we have decades of such research, and the aggregate result of all those studies is that no significant productivity gain can be demonstrated for static typing over dynamic, or vice versa.
Where is the proof that JavaScript is a better language than TypeScript? How do you know whether you should be writing in Java/Python/C#/Rust/etc.? You should probably wait to create your startup, lest you fall into a psychological trap. That is the ultimate conclusion of this article.
It's ok to learn and experiment with things, and to build up your own understanding of the world based on your lived experiences. You need to be open-minded and reevaluate your positions as more formalized understandings become available, but to say it's too dangerous to use AI because science hasn't tested anything is absurd.
This is an interesting question, really. It feels like it would be really hard to do a study on that. I guess the strength of TS would show up mainly as program complexity grows, such that you can't compare toy problems in student exams or whatever.
It seems messy. Just one example that I remember because it was on HN before: https://www.hillelwayne.com/post/this-is-how-science-happens...
We have one study on test-driven development, and another study that attempted to reproduce the results but found flaws in the original. Nothing conclusive.
The field of empirical research in software development practices is... woefully underfunded and incomplete. I think all we can say is, "more data needed."
hwayne did a talk on this [0].
If you ever try to read the literature, it's spartan. We certainly haven't improved enough in recent years to make conclusions about the productivity of LLM-based coding tools. We have a study on Copilot by Microsoft employees who studied Microsoft employees using it (Microsoft owns and develops Copilot). There's another study that suggests Copilot increases error rates in code bases by 41%.
What the author is getting at is that you can't rely on personal anecdotes, blog posts, and social media influencers to understand the effects of AI on productivity.
If we want to know how it affects productivity we need to fund more and better studies.
The FUD to spread is not that AI is a psychological hazard, but that critical reasoning and training are much, much more important than they once were, that it's only going to get more difficult, and that a large percentage of white-collar workers, artists, and musicians will likely lose their jobs.
Not sure which side of the argument this statement is promoting.
There must be something for which humans are essential. Right? Hello? Anybody? It's not looking good for new college graduates.[1]
[1] https://www.usatoday.com/story/money/2025/06/05/ai-replacing...
> But Cialdini’s book was a turning point because it highlighted the very real limitations to human reasoning. No matter how smart you were, the mechanisms of your thinking could easily be tricked in ways that completely bypassed your logical thinking and could insert ideas and trigger decisions that were not in your best interest.
The author is an outspoken AI skeptic, who then spends the rest of the article arguing that, despite clear evidence, LLMs are not a useful tool for software engineering.
I would encourage them to re-read the first half of their article and question if maybe they are falling victim to what it describes!
Baldur calls for scientific research to demonstrate whether LLMs are useful programming productivity enhancements or not. I would hope that, if such research goes against their beliefs, they would choose to reassess.
(I'm not holding my breath with respect to good research: I've read a bunch of academic papers on software development productivity in the past and found most of them to be pretty disappointing: this field is notoriously difficult to measure.)
The better question should be whether long-term LLM use in software will make the overall software landscape better or worse. For example, LLM use could theoretically allow "better" software engineering by reducing bugs and making it easier to code complex interfaces, but in the long run that could also increase complexity, making the overall user experience worse because everything is going to be rebuilt on more complex software/hardware infrastructures.
And LLM use by the top 10% of coders could make their software better while making the bottom 90% worse due to shoddy coding. Is that an acceptable trade-off?
The problem is, if we only look at one variable, such as "software engineering efficiency" measured in some operational way, we ignore the grander effects on the ecosystem, which I think will be primarily negative due to the bottom-90% effect (what people actually use will be nightmarish, even if a few large programs can be improved).
As I see it, there are two options:

1. Attempt to prevent LLMs from being used to write software. I can't begin to imagine how that would work at this point.
2. Figure out things we can do to try and ensure that the software ecosystem gets better rather than worse given the existence of these new tools.
I'm ready to invest my efforts in 2, personally.
> Our only recourse as a field is the same as with naturopathy: scientific studies by impartial researchers. That takes time, which means we have a responsibility to hold off as research plays out, much like we do with promising drugs
Author in another article:
> Most of the hype is bullshit. AI is already full of grifters, cons, and snake oil salesmen, and it’s only going to get worse.
https://illusion.baldurbjarnason.com/
So I assume he has scientific research at hand to back up his claim that AI is full of grifters, cons, ... and that it will get worse.
I also have to point out that the author's maligning of the now famous Cloudflare experiment is totally misguided.
"There are no controls or alternate experiments" -- there are tons and tons of implementations of the OAuth spec done without AI.
"We also have to take their (Cloudflare’s) word for it that this is actually code of an equal quality to what they’d get by another method." -- we do not. It was developed publicly in Github for a reason.
No, this was not a highly controlled lab experiment. It does not settle the issue once and for all. But it is an excellent case study, and a strong piece of evidence that AI is actually useful, and discarding it based on bad vibes is just dumb. You could discard it for other reasons! Perhaps after a more thorough review, we will discover that the implementation was actually full of bugs. That would be a strong piece of evidence that AI is less useful than we thought. Or maybe you concede that AI was useful in this specific instance, but still think that for development where there isn't a clearly defined spec AI is much less useful. Or maybe AI was only useful because the engineer guiding it was highly skilled, and anything a highly skilled engineer works on is likely to be pretty good. But just throwing the whole thing out because it doesn't meet your personal definition of scientific rigor is not useful.
I do hear where the author is coming from on the psychological dangers of AI, but the author's preferred solution of "simply do not use it" is not what I'm going to do. It would be more useful if, instead of fearmongering, the author gave concrete examples of the psychological dangers of AI. A controlled experiment would be best of course, but I'd take a Cloudflare-style case study too. And if that evidence cannot be provided, then perhaps the psychological danger of AI is overstated?
If you think the shoddy code currently put into production is fine you're likely to view LLM generated code as miraculous.
If you think that we should stop reinventing variations on the same shoddy code over and over, and instead find ways of reusing existing solid code and generally improving quality (this was the promise of Object Orientation back in the nineties, which now looks laughable), then you'll think LLMs are a cynical way to speed up the throughput of garbage code while employing fewer crappy programmers.
'kentonv said this best on another thread:
"It's not the typing itself that constrains, it's the detailed but non-essential decision-making. Every line of code requires making several decisions, like naming variables, deciding basic structure, etc. Many of these fine-grained decisions are obvious or don't matter, but it's still mentally taxing" [... they go on from here].
(Thread: https://news.ycombinator.com/item?id=44209249).
What does that look like on a scoreboard? I guess you'll have to wait a while. Most things that most people write, even when they're successful, aren't notable as code artifacts. A small fraction of successful projects do get that kind of notability; at some point in the next couple years, a couple of them will likely be significantly LLM-assisted, just because it's a really effective way of working. But the sparkliest bits of code in those projects are still likely to be mostly human.