frontpage.

Beginning January 2026, all ACM publications will be made open access

https://dl.acm.org/openaccess
389•Kerrick•1h ago•41 comments

Classical statues were not painted horribly

https://worksinprogress.co/issue/were-classical-statues-painted-horribly/
306•bensouthwood•4h ago•166 comments

Your job is to deliver code you have proven to work

https://simonwillison.net/2025/Dec/18/code-proven-to-work/
219•simonw•2h ago•191 comments

Virtualizing Nvidia HGX B200 GPUs with Open Source

https://www.ubicloud.com/blog/virtualizing-nvidia-hgx-b200-gpus-with-open-source
63•ben_s•3h ago•15 comments

Launch HN: Pulse (YC S24) – Production-grade unstructured document extraction

19•sidmanchkanti21•1h ago•5 comments

Are Apple gift cards safe to redeem?

https://daringfireball.net/linked/2025/12/17/are-apple-gift-cards-safe-to-redeem
236•tosh•2h ago•185 comments

Using TypeScript to Obtain One of the Rarest License Plates

https://www.jack.bio/blog/licenseplate
81•lafond•2h ago•65 comments

Jonathan Blow has spent the past decade designing 1,400 puzzles for you

https://arstechnica.com/gaming/2025/12/jonathan-blow-has-spent-the-past-decade-designing-1400-puz...
210•furcyd•6d ago•276 comments

Please Just Try Htmx

http://pleasejusttryhtmx.com/
172•iNic•2h ago•168 comments

Creating apps like Signal could be 'hostile activity' claims UK watchdog

https://www.techradar.com/vpn/vpn-privacy-security/creating-apps-like-signal-or-whatsapp-could-be...
298•donohoe•5h ago•200 comments

RCE via ND6 Router Advertisements in FreeBSD

https://www.freebsd.org/security/advisories/FreeBSD-SA-25:12.rtsold.asc
101•weeha•8h ago•56 comments

Microscopic robots that sense, think, act, and compute

https://www.science.org/doi/10.1126/scirobotics.adu8009
5•XzetaU8•4d ago•0 comments

Slowness is a virtue

https://blog.jakobschwichtenberg.com/p/slowness-is-a-virtue
178•jakobgreenfeld•6h ago•68 comments

Dogalog: A realtime Prolog-based livecoding music environment

https://github.com/danja/dogalog
18•triska•4d ago•3 comments

Gemini 3 Flash: Frontier intelligence built for speed

https://blog.google/products/gemini/gemini-3-flash/
1072•meetpateltech•1d ago•564 comments

Hightouch (YC S19) Is Hiring

https://hightouch.com/careers
1•joshwget•5h ago

Egyptian Hieroglyphs: Lesson 1

https://www.egyptianhieroglyphs.net/egyptian-hieroglyphs/lesson-1/
129•jameslk•11h ago•51 comments

I got hacked: My Hetzner server started mining Monero

https://blog.jakesaunders.dev/my-server-started-mining-monero-this-morning/
534•jakelsaunders94•19h ago•328 comments

Show HN: A local-first memory store for LLM agents (SQLite)

https://github.com/CaviraOSS/OpenMemory
29•nullure•4d ago•14 comments

What is an elliptic curve? (2019)

https://www.johndcook.com/blog/2019/02/21/what-is-an-elliptic-curve/
118•tzury•10h ago•12 comments

After ruining a treasured water resource, Iran is drying up

https://e360.yale.edu/features/iran-water-drought-dams-qanats
264•YaleE360•6h ago•214 comments

It's all about momentum

https://combo.cc/posts/its-all-about-momentum-innit/
93•sph•7h ago•32 comments

systemd v259 Released

https://github.com/systemd/systemd/releases/tag/v259
39•voxadam•2h ago•16 comments

Heart and Kidney Diseases and Type 2 Diabetes May Be One Ailment

https://www.scientificamerican.com/article/heart-and-kidney-diseases-plus-type-2-diabetes-may-be-...
30•Brajeshwar•1h ago•10 comments

From profiling to kernel patch: the journey to an eBPF performance fix

https://rovarma.com/articles/from-profiling-to-kernel-patch-the-journey-to-an-ebpf-performance-fix/
23•todsacerdoti•4d ago•1 comment

Most parked domains now serving malicious content

https://krebsonsecurity.com/2025/12/most-parked-domains-now-serving-malicious-content/
94•bookofjoe•4h ago•28 comments

The Big City; Save the Flophouses (1996)

https://www.nytimes.com/1996/01/14/magazine/the-big-city-save-the-flophouses.html
30•ChadNauseam•3d ago•10 comments

AI helps ship faster but it produces 1.7× more bugs

https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report
99•birdculture•4h ago•110 comments

Working quickly is more important than it seems (2015)

https://jsomers.net/blog/speed-matters
231•bschne•3d ago•110 comments

Spain fines Airbnb €65M: Why the government is cracking down on illegal rentals

https://www.euronews.com/travel/2025/12/15/spain-fines-airbnb-65-million-why-the-government-is-cr...
89•robtherobber•2h ago•87 comments

AI helps ship faster but it produces 1.7× more bugs

https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report
98•birdculture•4h ago

Comments

bogzz•3h ago
oh wow, an LLM-based company with an article that claims AI is oddly not as bad when it comes to generating gobbledegook as everyday empirical evidence should suggest
jjmarr•3h ago
Coderabbit is an LLM code review company so their incentives are the opposite. AI is terrible and you need more AI to review it.

fwiw, I agree. LLM-powered code review is a lifesaver. I don't use Coderabbit but all of my PRs go through Copilot before another human looks at them. It's almost always right.

bpicolo•3h ago
Their incentives are perfectly aligned - you’re making more bugs, surely you need some AI code review to help prevent that.

It’s literally right at the end of their recommendations list in the article

jjmarr•3h ago
The original comment said:

> an article that claims AI is oddly not as bad when it comes to generating gobbledegook

Ironically, Coderabbit wants you to believe AI is worse at generating gobbledegook.

GoatInGrey•16m ago
Make the gobbledygook from your gobbledygook generator better with our proprietary gobbledygook generator.

I'm obviously taking the piss here, but the irony is amusing.

elktown•2h ago
Your comment history suggests a pro-AI bias on par with AI companies. I don't understand it. It seems like critical thinking, nuance, and just basic caution have been turned off like a light switch for far too many people.
naasking•1h ago
Our industry never exhibited an abundance of caution, but if you have trouble understanding the value of AI here, consider that you are akin to an assembly language programmer in the 1970s or 80s who couldn't understand why people were so gung-ho about these compilers that just output worse code than they could write by hand. In retrospect, compilers only got better and better, familiarity with programming languages and compilation toolchains became a valuable productivity skill, and the market for assembly language programming either stagnated or shrank.

Doesn't it seem plausible to you that, whatever the ratio of bugs in AI-generated code today, that bug count is only going to go down? Doesn't it then seem reasonable to say that programmers should start familiarizing themselves with these new tools, learning where the pitfalls are and how to avoid them?

bogzz•1h ago
compilers aren't probabilistic models though
naasking•1h ago
Successful compiler optimizations are probabilistic though, from the programmer's point of view. LLMs are internally deterministic too.
miningape•1h ago
What? Do you even know how compilers work?
naasking•1h ago
Are you able to predict with 100% accuracy when a loop will successfully unroll, or various interprocedural or intraprocedural analyses will succeed? They are applied deterministically inside a compiler, but often based on heuristics, and the complex interplay of optimizations in complex programs means that sometimes they will not do what you expect them to do. Sometimes they work better than expected, and sometimes worse. Sounds familiar...
miningape•59m ago
> Are you able to predict with 100% accuracy when a loop will successfully unroll, or various interprocedural or intraprocedural analyses will succeed?

Yes, because:

> They are applied deterministically inside a compiler

Sorry, but an LLM randomly generating the next token isn't even comparable.

Deterministic complexity =/= randomness.

naasking•56m ago
> Yes, because:

Unless you wrote the compiler, you are 100% full of it. Even as the compiler writer you'd be wrong sometimes.

> Deterministic complexity =/= randomness.

LLMs are also deterministically complex, not random.

miningape•52m ago
> Unless you wrote the compiler, you are 100% full of it. Even then you'd be wrong sometimes

You can check the source code? What's hard to understand? If you find it compiled something wrong, you can walk backwards through the code; if you want to find out what it'll do, walk forwards. LLMs have no such capability.

Sure, maybe you're limited by your personal knowledge of the compiler chain, but again, complexity =/= randomness.

For the same source code and compiler version (+ flags) you get the exact same output every time. The same cannot be said of LLMs, because they use randomness (temperature).

> LLMs are also deterministically complex, not random

What exactly is the temperature setting in your LLM doing then? If you'd like to argue that the pseudorandom generators our computers use aren't truly random - fine, I agree. But for all practical purposes they're random, especially when you don't control the seed.

naasking•18m ago
> If you find it compiled something wrong, you can walk backwards through the code, if you want to find out what it'll do walk forwards. LLMs have no such capability.

Right, so you agree that optimization outputs are not fully predictable in complex programs, and what you're actually objecting to is that LLMs aren't like compiler optimizations in the specific ways you care about, and somehow this is supposed to invalidate my argument that they are alike in the specific ways that I outlined.

I'm not interested in litigating the minutiae of this point: programmers who treat the compiler as a black box (i.e. 99% of them) see probabilistic outputs. The outputs are generally reliable according to certain criteria, but unpredictable.

LLMs are also typically probabilistic black boxes. The outputs are also unpredictable, but also somewhat reliable according to certain criteria that you can learn through use. Where the unreliability is problematic you can often make up for their pitfalls. The need for this is dropping year over year, just as the need for assembly programming to eke out performance dropped with each year of compiler development. Whether LLMs will become as reliable as compiler optimizations remains to be seen.

mwigdahl•1h ago
True. The question is whether that's relevant to the trajectory described or not.
elktown•1h ago
If I have a horse and plow and you show up with a tractor, I will no doubt get a tractor asap. But if you show up with novel amphetamines for you and your horse and scream "Look how productive I am! We'll figure out the long-term downsides, don't you worry! Just more amphetamines probably!", I'm happy to be a late adopter.
naasking•1h ago
A tractor based on a Model T wouldn't have been very compelling either at the time. Not many horse-drawn plows these days though.
elktown•8m ago
I understand that you've convinced yourself that progress is inevitable. I'll ponder over it on my commute to Mars. Oh wait, that was still on the tele.
azemetre•1h ago
No because programmers aren't the ones pushing the wares, it's business magnates and sales people. The two core groups software developers should never trust.

Maybe it would be different if this LLM craze were being pushed by democratic groups where citizens are allowed to state their objections to such a system, and where such objections are taken seriously, but what we currently have is business magnates that just want to get richer with no democratic controls.

naasking•1h ago
> No because programmers aren't the ones pushing the wares, it's business magnates and sales people.

This is not correct; plenty of programmers are seeing value in these systems and use them regularly. I'm not really sure what's undemocratic about what's going on, but that seems beside the point: we're presumably mostly programmers here talking about the technical merits and downsides of an emerging tech.

NeutralCrane•52m ago
This seems like an overly reductive worldview. Do you really think there isn't genuine interest in LLM tools among developers? I absolutely agree there are people pushing AI in places where it is unneeded, but I have not found software development to be one of those areas. There are lots of people experimenting and hacking with LLMs because of genuine interest and perceived value.

At my company, there is absolutely no mandate for use of AI tooling, but we have a very large number of engineers who are using AI tools enthusiastically simply because they want to. In my anecdotal experience those who do tend to be much better engineers than the ones who are most skeptical or anti-AI (though it's very hard to separate how much of this is the AI tooling, and how much is that naturally curious engineers looking for new ways to improve inevitably become better engineers than those who don't).

The broader point is, I think you are limiting yourself when you immediately reduce AI to snake oil being sold by "business magnates". There is surely a lot of hype that will die out eventually, but there is also a lot of potential there that you guarantee you will miss out on when you dismiss it out of hand.

azemetre•15m ago
I use AI every day and run my own local models, that has nothing to do with seeing sales people acting like sales people or conmen being con artists.

Also add in the fact that big tech has been extremely damaging to western society for the last 20 years, there's really little reason to trust them. Especially since we see how they treat those with different opinions than them (trying to force them out of power, ostracize them publicly, or in some cases straight up poisoning people + giving them cancer).

Not really hard to see how people can be against such actions? Well buckle up bro, come post 2028 expect a massive crackdown and regulations against big tech. It's been boiling for quite a while and there's trillions of dollars to plunder for the public's benefit.

gldrk•50m ago
High-level languages were absolutely indispensable at a time when every hardware vendor had its own bespoke instruction set.

If you only ever target one platform, you might as well do it in assembly, it's just unfashionable. I don't believe you'd lose any 'productivity' compared to e.g. C, assuming equal amounts of experience.

naasking•15m ago
> I don't believe you'd lose any 'productivity' compared to e.g. C, assuming equal amounts of experience.

I'm skeptical, but do you think that you'd see no productivity gains for Python, Java or Haskell?

gldrk•5m ago
Those are garbage-collected environments. I have some experience with a garbage-collected 'assembly' (.NET CIL). It is a delight to read and write compared to most C code.
saulpw•15m ago
Type checking, even one as trivial as C's, is a boon to productivity, especially on large teams but also when coding solo if you have anything else in your brain.
NeutralCrane•1h ago
> It seems like critical thinking, nuance, and just basic caution have been turned off like a light-switch for far too many people.

Ironically, this response contains no critical thinking or nuance.

elktown•54m ago
Such a typical HN "gotcha!".
NeutralCrane•43m ago
I recommend engaging with ideas next time, rather than making reductive, ad-hominem, thought-terminating statements.
elktown•27m ago
Thanks! I recommend not reading all comments literally. We have a significant hype bubble atm and I'm not exactly alone in thinking how crazy it is. I think you can draw a connection from my exasperated statement to that if you really wanted to.
XenophileJKO•32m ago
They're not wrong. I think many people also saw/see the trajectory of the models.

If you were pro-AI doing the majority of coding a year ago, you would have been optimistically ahead of what the tech was actually capable of.

If you are strongly against AI doing the majority of coding now, you are likely well behind what the current tech is capable of.

People who were pragmatic and knowledgeable anticipated this rise in capability.

GoatInGrey•23m ago
My operating assumption, for everyone acting the way you described, is that it's predicated on the belief of "I have an opportunity to make money from this." It is exceedingly rare to find an instance of someone using the tech purely for the love of the game who isn't also tying it back to income generation in some way.
asdfdfd•7m ago
it's called a love of money
tyleo•3h ago
I have a theory that vibe coding existed before AI.

I’ve worked with plenty of developers who are happy to slam null checks everywhere to solve NREs with no thought to why the object is null, whether it should even be null here, etc. There’s just a vibe that the null check works and solves the problem at hand.

I actually think a few folks like this can be valuable around the edges of software but whole systems built like this are a nightmare to work on. IMO AI vibe coding is an accelerant on this style of not knowing why something works but seeing what you want on the screen.
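A minimal TypeScript sketch of that pattern (the types and names here are hypothetical, not from the comment): the guard makes the null-reference crash go away without anyone asking whether the object should ever be null at this point.

```typescript
interface Customer { name: string }
interface Order { customer: Customer | null }

// The "vibe" fix: guard the crash site and move on. The invoice silently
// prints nothing, and nobody asks why the customer was missing at all.
function printInvoice(order: Order) {
  if (order.customer != null) {
    console.log(order.customer.name);
  }
}

// The intentional fix: if an order is never valid without a customer,
// encode that in the type so the null can't reach this code.
interface ValidOrder { customer: Customer }

function printValidInvoice(order: ValidOrder) {
  console.log(order.customer.name); // no guard needed
}
```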

jmkni•3h ago
Blindly copying and pasting from StackOverflow until it kinda sorta works is basically vibe coding

AI just automates that

giantg2•2h ago
Yeah, but you had to integrate it until it at least compiled, which kind of made people think about what they were pasting.

I had a peer who suddenly started completing more stories for a month or two when our output was largely equal before. They got promoted over me. I reviewed one of their PRs... what a mess. They were supposed to implement caching. Their first attempt created the cache but never stored anything in it. Their next attempt stored the data in the cache, but never looked at the cache - always retrieving from the API. They deleted that PR to hide their incompetence and opened a new one that was finally right. They were just blindly using AI to crank out their stories.

That team had something like 40% of capacity being spent on tech debt, rework, and bug fixes. The leadership wanted speed above all else. They even tried to fire me because they thought I was slow, even though I was doing as much or more work than my peers.

skydhash•1h ago
> Yeah, but you had to integrate it until it at least compiled, which kind of made people think about what they're pasting

That’s a very low bar. It’s easy to get a program to compile. And if it’s interpreted, you can coast for months with no crashes, just corrupted state.

The issue is not that they can’t code, it’s that they can’t problem solve and can’t design.

giantg2•1h ago
Yeah, but integrating manually is more likely to force them to think than if the agent just does everything. You used to have to search stackoverflow, which requires articulating the problem. Now you can just tell copilot to fix it.
PaulHoule•40m ago
It's a frustrating situation. I had a stretch in my career when I was the clean-up person who did the 90% of work that was left after management thought a junior had gotten it 90% done. It's potentially very satisfying but very easy to feel unappreciated in (e.g. they wish the junior could have gotten it done and thought I was "too slow", though in retrospect one year of that was an annus mirabilis where I completed an almost unbelievable number of diverse projects).
dionian•21m ago
To be fair, my AI setup almost always compiles before thinking it's done.
zipy124•3h ago
I agree, but I'd draw a different comparison: vibe coding has accelerated the type of developer who relied on Stack Overflow to solve all their problems, the kind of dev who doesn't try to solve problems themselves. It has just accelerated this way of working, but is less reliable than before.
skeeter2020•1h ago
this matches my first thought about this "study" (remember what CodeRabbit sells...); can you compare these types of PRs directly? Is the conclusion that AI produces more bugs, or is that a symptom of something else, like AI PRs being produced by less experienced developers?
whynotmaybe•3h ago
"on error resume next" has been the first line of many vba scripts for years
eterm•2h ago
I caught Claude trying to sneak the equivalent into a CI script yesterday as I was wrangling how to run framework and dotnet tests next to each other without slowing down the framework tests horrendously.

It tried to sneak in changing the CI build script to proceed to the next step on failure.

It's a bold approach, I'll give it that.

skeeter2020•1h ago

  1. if it won't compile you'll give up on the tool in minutes or an hour.
  2. if it won't run you'll give up in a few hours or a day.
  3. if it sneaks in something you don't find until you're almost - or already - in production it's too late.
charitable: the model was trained on a lot of weak/lazy code product.

less-charitable: there's a vested interest in the approach you saw.

andy99•18m ago
Yeah, it’s trained to do that somewhere, though it’s not necessarily malicious. For RLHF (the model fine-tuning), the HF stands for human feedback, but it's really another trained model that’s trained to score replies the way a human would. And so if that model likes code that passes tests more than code that’s stuck in a debugging loop, that’s what the model becomes optimized for.

In a complex model like Claude there is no doubt much more at work, but some version of optimizing for the wrong thing is what’s ultimately at play.

eurekin•3h ago
"ship fast, break things"
palmotea•2h ago
> I actually think a few folks like this can be valuable around the edges of software but whole systems built like this are a nightmare to work on. IMO AI vibe coding is an accelerant on this style of not knowing why something works but seeing what you want on the screen.

I would correct that: it's not an accelerant of "seeing what you want on the screen," it's an accelerant of "seeing something on the screen."

[Hey guys, that's a non-LLM "it's not X, it's Y"!]

Things like habitual, unthoughtful null-checks are a recipe for subtle data errors that are extremely hard to fix because they only get noticed far away (in time and space) from the actual root cause.

jerf•2h ago
One of my frustrations with AI, and one of the reasons I've settled into a tab-complete based usage of it for a lot of things, is precisely that the style of code it uses in the language I'm using puts out a lot of things I consider errors based on the "middle-of-the-road" code style that it has picked up from all the code it has ingested. For instance, I use a policy of "if you don't create invalid data, you won't have to deal with invalid data" [1], but I have to fight the AI on that all the time because it is a routine mistake programmers make and it makes the same mistake repeatedly. I have to fight the AI to properly create types [2] because it just wants to slam everything out as base strings and integers, and inline all manipulations on the spot (repeatedly, if necessary) rather than define methods... at all, let alone correctly use methods to maintain invariants. (I've seen it make methods on some occasions. I've never seen it correctly define invariants with methods.)

Using tab complete gives me the chance to generate a few lines of a solution, then stop it, correct the architectural mistakes it is making, and then move on.

To AI's credit, once corrected, it is reasonably good at using the correct approach. I would like to be able to prompt the tab completion better, and the IDEs could stand to feed the tab completion code more information from the LSP about available methods and their arguments and such, but that's a transient feature issue rather than a fundamental problem. Which is also a reason I fight the AI on this matter rather than just sitting back: In the end, AI benefits from well-organized code too. They are not infinite, they will never be infinite, and while code optimized for AI and code optimized for humans will probably never quite be the same, they are at least correlated enough that it's still worth fighting the AI tendency to spew code out that spends code quality without investing in it.

[1]: Which is less trivial than it sounds and violated by programmers on a routine basis: https://jerf.org/iri/post/2025/fp_lessons_half_constructed_o...

[2]: https://jerf.org/iri/post/2025/fp_lessons_types_as_assertion...
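A rough TypeScript sketch of the distinction in [2] (my own illustration, not taken from jerf's posts): a single validating entry point constructs the value, so downstream code never re-validates raw strings inline.

```typescript
// Raw-string style the models tend to emit: every call site inlines its own,
// slightly different validation, and nothing stops an unchecked string from
// flowing through.
function sendWelcomeRaw(email: string) {
  if (email.includes("@")) { /* ... */ }
}

// Invariant-holding type: invalid data is never created, so it never has to
// be handled downstream.
class EmailAddress {
  private constructor(readonly value: string) {}

  static parse(raw: string): EmailAddress {
    const normalized = raw.trim().toLowerCase();
    if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(normalized)) {
      throw new Error(`invalid email address: ${raw}`);
    }
    return new EmailAddress(normalized);
  }
}

function sendWelcome(to: EmailAddress) {
  // `to` is known-valid here; no defensive re-checking needed.
}
```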

tyleo•2h ago
This is close to my approach. I love copilot intellisense at GitHub’s entry tier because I can accept/reject on the line level.

I barely ever use AI code gen at the file level.

Other uses I’ve gotten are:

1. It’s a great replacement for search in many cases

2. I have used it to fully generate bash functions and regexes. I think it’s useful here because the languages are dense and esoteric. So most of my time is remembering syntax. I don’t have it generate pipelines of scripts though.

ryandrake•26m ago
> a lot of things I consider errors based on the "middle-of-the-road" code style that it has picked up from all the code it has ingested. For instance, I use a policy of "if you don't create invalid data, you won't have to deal with invalid data"

Yea, this is something I've also noticed but it never frustrated me to the point where I wanted to write about it. Playing around with Claude, I noticed it has been trained to code very defensively. Null checks everywhere. Data validation everywhere (regardless of whether the input was created by the user, or under the tight control of the developer). "If" tests for things that will never happen. It's kind of a corporate "safe" style you train junior programmers to do in order to keep them from wrecking things too badly, but when you know what you're doing, it's just cruft.

For example, it loves to test all my C++ class member variables for null, even though there is no code path that creates an incomplete class instance, and I throw if construction fails. Yet it still happily whistles along, checking everything for null in every method, unless I correct it.

PaulHoule•43m ago
My experience with AI coding is mixed.

In some cases I feel like I get better quality at slightly more time than usual. My testing situation in the front end is terribly ugly because of the "test framework can't know React is done rendering" problem, but working with Junie I figured out a way to isolate object-based components and run them as real unit tests with mocks. I had some unmaintainable TypeScript which would explode with gobbledygook error messages that neither Junie nor I could understand whenever I changed anything, but after two days of talking about it and working on it, it was an amazing feeling to see that the type finally made sense to me and Junie at the same time.

In cases where I would have tried one thing I can now try two or three things and keep the one I like the best. I write better comments (I don't do the Claude.md thing but I do write "exemplar" classes that have prescriptive AND descriptive comments and say "take a look at...") and more tests than I would on my own for the backend.

Even if you don't want Junie writing a line of code it shines at understanding code bases. If I didn't understand how to use an open source package from reading the docs I've always opened it in the IDE and inspected the code. Now I do the same but ask Junie questions like "How do I do X?" or "How is feature Y implemented?" and often get answers quicker than digging into unfamiliar code manually.

On the other hand it is sometimes "lights on and nobody home", and for a particular patch I am working on now it's tried a few things that just didn't work or had convoluted if-then-else ladders that I hate (even if I told it I didn't like that) but out of all that fighting I got a clear idea of where to put the patch to make it really simple and clean.

But yeah, if you aren't paying attention it can slip something bad past you.

phartenfeller•3h ago
Definitely. But AI can also generate unit tests.

You have to be careful to tell the LLM exactly what to test for and to manually check the whole suite of tests. But overall it makes me feel way more confident over increasing amounts of generated code. This of course decreases the productivity gains but is necessary in my opinion.

And linters help.

SketchySeaBeast•3h ago
I've been using Claude Sonnet 4.5 lately and I've noticed a tendency for it to create tests that prove themselves. Rather than calling the function we're hoping to test, it re-implements the code in the test and then tests it there. It's helpful, and it usually works very well if you have well-defined inputs and outputs; I much prefer it over writing tests manually, but you have to be very careful.
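A hypothetical jest-style illustration of that failure mode (the `./pricing` module and `applyDiscount` function are made up): the first test re-derives the expected value with the same arithmetic instead of exercising the code under test, so it stays green no matter what the real function does.

```typescript
import { applyDiscount } from "./pricing"; // hypothetical module under test

// Self-proving test: re-implements the logic inline and checks it against itself.
test("applies a 10% discount (proves itself)", () => {
  const price = 200;
  const discounted = price - price * 0.1; // same formula, re-derived in the test
  expect(discounted).toBe(180);           // green even if applyDiscount is broken
});

// What we actually wanted: call the real implementation.
test("applies a 10% discount (real)", () => {
  expect(applyDiscount(200, 0.1)).toBe(180);
});
```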
stuaxo•3h ago
It doesn't generate good tests by default though.

I worked on a team where we had someone come in and help us improve our tests a lot.

The default LLM-generated tests are a bit like the ones I wrote before that experience.

dnautics•3h ago
this is solvable by prompting and giving good examples?
strangescript•3h ago
Do they consider code readability, formatting and variable naming as "errors" for the overall count? That seems dubious given where we are headed.

No one cares what a compiler or js minifier names its variables in its output.

Yes, if you don't believe we will ever get there, then this is a totally valid complaint. You are also wrong about the future.

oblio•3h ago
The "future" is a really long time.

I'll take the other side of your bet for the next 10 years but I won't take it for the next 30 years.

In that spirit, I want my fusion reactor and my flying car.

strangescript•2h ago
If your outlook is 10 years then for sure, it's valid. I am not sure how you come to that conclusion logically though. At the beginning of the year we had 0 code agents. Now we have dozens, some basically free (of various degrees of quality, sure).

The last 2-3 months of releases have been an unprecedented whirlwind. Code writing will be solved by the end of 2026. Architecture, maybe not, but formatting issues isn't architecture.

oblio•2h ago
It's similar with every technology; there's a reason we have sigmoids.

In 1960 they were planning nuclear powered cars and nuclear mortars.

bopbopbop7•2h ago
Code writing was solved in 1997 when Dreamweaver was released.
oblio•1h ago
Nope, it was solved with Visual Basic in 1991. And with Nextstep in 1989. And with...

I really dislike people comparing GenAI with compilers. Compilers largely do mechanical transformations; they do almost 0 logic changes (and if they do, they're bugs).

We are in an industry that's great at throwing (developing) and really bad at catching (QA) and we've just invented the machine gun. For some reason people expect the machine gun to be great at catching, or worse, they expect to just throw things continuously and have things working as before.

There is a lot of software for which bugs (especially data handling bugs) don't meaningfully affect its users. BUT there isn't a lot of software we use daily and rely on for which that's the case.

I know that GenAI can help with QA, but I don't really see a world where using GenAI for both coding and QA gets us to where we want to go, unless as some people say, we start using formal verification (or other very rigorous and hopefully automatable advanced verification), at which point we'll have invented a new category of programmers (and we will need to train all of them since the vast majority of current developers don't know about or use formal verification).

cgearhart•3h ago
So…great for prototyping (where velocity rules) but somewhere between mixed and negative for critical projects. Seems like this just puts some mildly quantitative numbers behind the consensus & trends I see emerging.
GoatInGrey•7m ago
I'm seeing parallels between this and factory-assembled houses.

Input costs are lower and velocity is higher. You get a finished product out the door quicker, though maintenance is more expensive. Largely because the product is no longer a collection of individual parts made to be interfaced by a human. It is instead a machine-assembled good that requires a machine to perform "the work". Therefore, because the machine is only designed to assemble the good, your main recourse is to have the machine assemble a full replacement.

With that framing, there seems to be a tradeoff to bear in mind when considering fit for the problem we're meaning to solve. It also explains the widespread success of LLMs generating small scripts and MVPs. Which are largely disposable.

everdrive•3h ago
Sounds like what companies have been scrambling for this whole time. People just want to dump something out there. They don't really care if it works very well.
0x3f•3h ago
At best this would be 1.7x more _discovered_ bugs. The average PR (IMO) is hardly checked. AI could have 10x as many real issues on PRs, but we're just bad at reviewing PRs.
bodge5000•3h ago
As has already been said, we've been here before. I could ship significantly faster if I ignored any error handling or edge cases and basically just assumed the data would flow 100% how I expect it to all the time. Of course that is almost never the case, so I'd end up with more bugs.

I'd like to say that AI just takes this to an extreme, but I'm not even sure about that. I think it could produce more code and more bugs than I could in the same amount of time, but not significantly more than if I just gave up on caring about anything.

nerdjon•3h ago
Something I have been very curious about for some time now. We know the quality of the code is not very high and has a high likelihood of bugs.

But, assuming there are no bugs and the code ships, has there been any study on resource usage creeping up and the impact of this on a whole system? In the tests I have done trying to build things with AI, it always seems like there is zero efficiency unless you notice it and can point it in the right direction.

I have been curious about the impact this will have on general computing as more low quality code makes it into applications we use every day.

windex•3h ago
I think devs have now split into two camps, the kvetchers and the shippers. It's a new tool, it's fresh. Things will work themselves out over the next couple of years/months(?). The kvetching helps keep AI research focused on the problem, which is good. Meanwhile, continue to ship.
SideburnsOfDoom•3h ago
> ship faster but it produces more bugs

This is ... not actually faster.

mmastrac•3h ago
In the pre-AI days I worked on a system like this that was constructed by a high-profile consulting team but continuously lost data and failed to meet even the basic standards.

I think I've seen so much rush-shipped slop (before and after) that I'm really anxiously waiting for this bubble to pop.

I have yet to be convinced that AI tooling can provide more than 20% or so speedup for an expert developer working in a modern stack/language.

yomismoaqui•3h ago
Agentic AI coding is a tool, you can use it wrong.

To give an example of how to use AI successfully check the following post:

https://friendlybit.com/python/writing-justhtml-with-coding-...

cmiles8•3h ago
There are certainly some valid criticisms of vibe coding. That said, it’s not like the quality of most code was amazing before AI came along. In fact, most code is generally pretty terrible and took far too long for teams to ship.

Many folks would say that if shipping faster allows for faster iterations across an idea, then the silly errors are worth it. I've certainly seen a sharp increase in execs calling BS on dev teams saying they need months to develop some basic thing.

tyleo•3h ago
I think you need a balance. I’ve seen products fall apart due to high error rate.

I like to think of intentionalists—people who want to understand systems—and vibe coders—people who just want things to work on screen expediently.

I think success requires a balance of both. The current problem I see with AI is that it accelerates the vibe part more than the intentionalist part and throws the system out of balance.

cmiles8•2h ago
Don’t disagree… I think it’s just applying a lot more pressure on dev teams to do things faster though. Devs tend to be expensive and expectations on productivity have increased dramatically.

Nobody wants teams to ship crap, but also folks are increasingly questioning why a bit of final polishing takes so long.

jmathai•2h ago
More important than code quality is a joint understanding of the business problem and the technical solution for it. Today, that understanding is spread across multiple parties (eng, pm, etc).

Code quality can be poor as long as someone understands the tradeoffs for why it's poor.

coliveira•2h ago
When a team says that a "trivial" feature takes months to ship, it is not because of the complexity of the algorithm. It's because of the infrastructure and coordination work required for the feature to properly work. It is almost always a failure of the technical infrastructure previously created in the company. An AI will solve the trivial aspects of the problem, not the real problem.
dj_gitmo•1h ago
> It is almost always a failure of the technical infrastructure previously created in the company. An AI will solve the trivial aspects of the problem, not the real problem.

This is so true. Software that should be simple can become so gnarly because of bad infra. For example, our CI/CD team couldn't get updated versions of Python on the CI machines, so suddenly we needed to start using Docker for what should be very simple software. That's just an example, but you get the idea, and it causes problems to compound over the years.

You really want good people with sharp elbows laying the foundations. At one time I resented people like that, but now I have seen what happens when you don't have anyone like that making technical decisions.

WhyOhWhyQ•2h ago
And you think people who don't understand the software telling people who do they're doing it wrong is an outright positive?
Aurornis•2h ago
> I’ve certainly seen a sharp increase in execs calling BS on dev teams saying they need months to develop some basic thing.

Some of the teams I worked with in the years right before AI coding went mainstream had become really terrible about this. They would spend months forming committees, writing documents, getting sign-offs and approvals, creating Gantt charts, and having recurring meetings for the simplest requests.

Before I left, they were 3 months deep into meetings about setting up role based access control on a simple internal CRUD app with a couple thousand users. We needed about 2-3 roles. They were into pros and cons lists for every library and solution they found, with one of the front runners involving a lot of custom development for some reason.

Yet the entire problem could have been solved with 3 Boolean columns in the database for the 3 different roles. Any developer could have done it in an afternoon, but they were stuck in a mindset of making a big production out of the process.

I feel like LLMs are good at getting those easy solutions done. If the company really only needs a simple change, having an LLM break free from the molasses of devs who complicate everything is a breath of fresh air.

On the other hand, if the company had an actual complicated need with numerous and changing roles over time, the simple Boolean column approach would have been a bad idea. Having people who know when to use each solution is the real key.
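A minimal sketch of the "three Boolean columns" shape described above (the column and role names here are invented for illustration): the whole authorization layer is a couple of predicates, which is also why it only works while the roles stay few and fixed.

```typescript
// Hypothetical user row: roles live directly on the user record,
// no separate role/permission tables.
interface UserRow {
  id: number;
  email: string;
  is_admin: boolean;   // manage users and settings
  is_editor: boolean;  // create and modify records
  is_viewer: boolean;  // read-only access
}

// Authorization checks stay one-liners.
const canEdit = (u: UserRow): boolean => u.is_admin || u.is_editor;
const canView = (u: UserRow): boolean => u.is_admin || u.is_editor || u.is_viewer;
```

As the comment notes, the moment roles multiply or change over time, a proper role table earns its keep.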

cmiles8•3m ago
Yes. I’ve seen meetings where the dev team is going on and on about how it will take weeks to add a feature and someone calling BS just shares their screen, asks some AI agent to code it up, and it does. Is it 100% perfect? Perhaps not, but it's close and does put the dev team in a spot of having to truly justify why it will take so long vs hand-wavy smoke and mirrors and "it's technical, you wouldn't understand" commentary to leadership. Things have changed and I don’t think we’re going back.
dannersy•18m ago
This attitude just furthers our race to the bottom. I agree with iteration, but software quality is getting really laughable. I know we're still on the better side of what existed in the hands of consumers in the 90s, but anyway... Execs calling BS is further evidence of that race to the bottom.
exitb•3h ago
1.7x does not look that bad? If "AI code" is a broad classification that includes people using bad tools, or not being very skilful operators of said tools, then we can expect this number to meaningfully improve over time.
speed_spread•2h ago
Tell that to your customers. And tell them how much longer the bugs generated by AI will take to fix by humans. Or tell them that you'll never fix the bugs because you're too busy vibe coding new ones.
exitb•2h ago
I'm not saying bugs aren't a problem. I'm saying that if an emerging, fast improving tech is only slightly behind a human coder now, it seems conceivable that we're not that far off when they reach parity.
naasking•1h ago
Exactly. I'm sure assembly language programmers from the 1980s could easily write code that ran 2x faster than the code produced by compilers of the time, but compilers only got better and eventually assembly language programming became a rare job, and humans can rarely outperform compilers on whole program compilation.
cryptonym•1h ago
Assembly experts still write code that runs faster than code produced by compilers. Being slower is predictable and solved with better hardware, or just waiting. This is fine for most so we switched to easier or portable languages. Output of the program remains the same.

Impact of having 1.7x more bugs is difficult to assess and is not solved that easily. Comparison would work if that was about optimisations: code that is 1.7x slower / memory hungry.

naasking•58m ago
> Assembly experts still write code that runs faster than code produced by compilers.

They sometimes can, but this is no longer a guaranteed outcome. Superoptimizers can often put manual assembly to shame.

> Impact of having 1.7x more bugs is difficult to assess and is not solved that easily.

Time will tell. Arguably the number of bugs produced by AI 2 years ago was much higher than 1.7x. In 2 more years it might only be 1.2x bugs. In 4 years time it might be barely measurable. The trend over the next couple of years will judge whether this is a viable way forward.

lherron•3h ago
They buried the lede. The last half of the article, with ways to ground your dev environment to reduce the most common issues, should be its own article. (However, implementing the proper techniques somewhat obviates the need for CodeRabbit, so I guess it’s understandable.)
bgwalter•2h ago
The report is from cortex.io, based on only 50 self-selected responses from "engineering leaders" as well as from idpcon.com, hosted by cortex.

All websites involved are vibe coded garbage that use 100% CPU in Firefox.

neallindsay•2h ago
1.7x more is not the same as 1.7x as many.
esafak•1h ago
It's a lost cause. "It's two times faster!"
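To make neallindsay's point concrete with a worked example: from a baseline of 10 bugs, 1.7x as many is 17 bugs, while 1.7x more (the baseline plus another 1.7x the baseline) is 10 + 17 = 27 bugs, i.e. 2.7x as many. Per carra's comment further down, the report's numbers appear to describe the former.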
brainless•2h ago
I use LLMs to generate almost all my code. Currently at 40K lines of Rust, backend and a desktop app. I am a senior engineer with almost all my tech career (16 years) in startups.

Coding with agents has forced me to generate more tests than we do in most startups, think through more things than we get the time to do in most startups, create more granular tasks and maintain CI/CD (my pipelines are failing and I need to fix them urgently).

These are all good things.

I have started thinking through my patterns to generate unit tests. I was generating mostly integration or end to end tests before. I started using helping functions in API handlers and have unit tests for helpers, bypassing the API level arguments (so not API mocking or framework test to deal with). I started breaking tasks down into smaller units, so I can pass on to a cheaper model.

There are a few patterns in my prompts but nothing that feels out of place. I do not use agents files and no MCPs. All sources here: https://github.com/brainless/nocodo (the product is itself going through a pivot so there is that).
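The helper-extraction pattern described above, sketched in TypeScript rather than Rust purely for brevity (all names are hypothetical, jest-style test globals assumed): the handler keeps only framework plumbing, and the unit test exercises the helper directly, with no API mocking.

```typescript
// Thin handler: only framework plumbing lives here (req/res typed loosely for brevity).
async function createProjectHandler(req: any, res: any): Promise<void> {
  const result = buildProject(req.body);
  res.status(result.ok ? 201 : 400).json(result);
}

// Plain helper holding the actual logic; trivially unit-testable.
function buildProject(input: unknown): { ok: boolean; name?: string; error?: string } {
  const name = typeof (input as any)?.name === "string" ? (input as any).name.trim() : "";
  return name.length > 0 ? { ok: true, name } : { ok: false, error: "name is required" };
}

// Unit test hits the helper directly: no API mocking, no framework test setup.
test("rejects a missing project name", () => {
  expect(buildProject({})).toEqual({ ok: false, error: "name is required" });
});
```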

WhyOhWhyQ•2h ago
I see that your release is GPL 3.0. Are you worried about LLMs effectively laundering your source code a year from now? I've become hesitant about releasing source code since LLMs, though I do use Claude heavily while programming to make suggestions and look for issues etc., but I'd be interested in hearing your perspective.
asdfdfd•6m ago
are you Indian? don't lie
naasking•1h ago
It's totally plausible that AI codegen produces more bugs. It still seems important to familiarize yourself with these tools now though, because that bug count is only ever going to go down. These tools are here to stay.
oasisaimlessly•1h ago
Are you trying to assure others or reassure yourself?
naasking•54m ago
Just extrapolating a trend of the past few years. If you disagree, then carry on and time will tell.
esafak•1h ago
It produces more bugs but the count goes down?!
naasking•55m ago
Did the models from 2 years ago produce more bugs, fewer bugs or the same bugs as today's models? Do you think next years AI models will produce the same number of bugs, more bugs, or fewer bugs?
geldedus•1h ago
Not for me.
827a•1h ago
Archetypes of prompts that I find AI to be quite good at handling:

1. "Write a couple lines or a function that is pretty much what four years ago I would have gone to npm to solve" (e.g. "find the md5 hash of this blob")

2. "Write a function that is highly represented and sampleable in the rest of the project" (e.g. "write a function to query all posts in the database by author_id" (which might include app-specific steps like typing it into a data model)).

3. "Make this isolated needle-in-a-haystack change" (e.g. "change the text of such-and-such tooltip to XYZ") (e.g. "there's a bug with uploading files where we aren't writing the size of the file to the database, fix that")

I've found that it can definitely do wider-ranging tasks than that (e.g. implement all API routes for this new data type per this description of the resource type and desired routes); and it can absolutely work. But the problems I run into:

1. Because I don't necessarily have a grokable handle on what it generated, I don't have a sense of what it's missing and needs follow-on prompts to create. E.g.: I tell it to write an endpoint that allows users to upload files. A few days later, we realize we aren't MD5-hashing the files that got uploaded; there was a field in the database & resource type to store this value, but it didn't pick up on that, and I didn't prompt it to do this; so it's not unreasonable. But oftentimes when I'm writing routes by hand, I'm spending so much time in that function body that follow-on requirements naturally occur to me ("Oh that's right, we talked about needing this route available to both of these two permissions, crap let me implement that"). With AI, it finishes so fast that my brain doesn't have time to remember all the requirements.

2. We've tried to mitigate this by pushing more development into the specs and requirements up-front. This is really hard to get humans to do, first of all. But more critically: None of our data supports the hypothesis that this has shortened cycle times. It mostly just trades writing typescript for reading & writing English (which few engineers I've ever worked with are actually all that good at). The engineers still end up needing long cycle times back-and-forth with the AI to get correct results, and long cycle times in review.

3. The more code you ask it to generate, the more vibeslop you get. Deeply-nested try/catch statements with multiple levels of error handling & throwing. Comments everywhere. Reimplementing the same helper functions five times. These things, we have found, raise the cost and lower the reliability & performance of future prompting, and quickly morph parts of the system into a no-man's-land (literally) where only AIs can really make any change; and every change, even by the AIs, gets harder and harder to ship. Our reported customer issues on these parts of the app are significantly higher than on others, and our ability to triage these issues is also impacted because we no longer have SMEs that can just brain-triage issues in our CS channels; everything now requires a full engineering cycle, with AI involvement, to solve.

Our engineers run the spectrum of "never wanted to touch AI, never did" to "earnestly trying to make it work". Ultimately I think the consensus position is: Its a tool that is nice to have in the toolbox, but any assertion that its going to fundamentally change the profile of work our engineers do, or even seriously impact hiring over the long-term, is outside the realm of foreseeable possibility. The models and surrounding tooling are not improving fast enough.

kristopherleads•1h ago
I really think the answer here is human-in-the-loop. Too many people are thinking that AI is a full on drop-in replacement for engineers or managers, but ultimately having it be an augment is the magic. I work at FlowFuse so super biased, but that's something I've really enjoyed with our MCP and Expert Assistant - it's built to help you, not to replace you, so you can ask questions, get insights, etc. faster.

I suppose the tl;dr is if you're generating bugs in your flow and they make it to prod, it's not a tool problem - it's a cultural one.

sailfast•1h ago
How many more bugs does it produce if we use CodeRabbit to review PRs? I assume the number will be less? (Asking seriously and hopefully if the product will help or would’ve caught the bugs, while also pointing out the natural conclusion of the article is to purchase your service :) )
jampa•14m ago
I've used it in some repos. It doesn't catch all code review issues, especially around product requirements and logic simplification, and occasionally produces irrelevant comments (suggesting a temporary model downgrade).

But it's well worth it. It has saved me some considerable time. I let it run first, even before my own final self-review (so if others do the same, the article's data might be biased). It's particularly good at identifying dead code and logical issues. If you tune it with its own custom rules (like Claude.md), you can also cut a lot of noise.

carra•1h ago
Am I the only one thinking that 1.7x is a very weird way of saying "70% more"? It's even wrong since, as other comments point out, 1.7x MORE would in fact be 2.7 times as much. Which is not what the bug numbers say.
visarga•15m ago
AI helps ship faster but we need to code 1.7x more tests (with AI) and it's alright
kkarpkkarp•11m ago
I can't find whether they deducted false positives before they counted the results. I've been using CodeRabbit heavily and, like any other AI code review tool, it had a lot of them.

For example: it reported missing data validation / sanitization only because the code had already been sanitized / validated somewhere that is not visible in the diff.

You can tell CodeRabbit it is wrong about this, though, and the tool then accepts it.

TheAnkurTyagi•10m ago
code reviews take way longer now because you gotta actually read through everything instead of trusting the dev knew what they were writing. It's like the AI is great at the happy path but completely misses edge cases or makes weird assumptions about state...

The real kicker is when someone copies AI-generated code without understanding it and then 3 months later nobody can figure out why production keeps having these random issues. Debugging AI slop is its own special hell