
Gemini 3 Flash: Frontier intelligence built for speed

https://blog.google/products/gemini/gemini-3-flash/
600•meetpateltech•5h ago•281 comments

How SQLite is tested

https://sqlite.org/testing.html
163•whatisabcdefgh•3h ago•33 comments

Inside PostHog: SSRF, ClickHouse SQL Escape and Default Postgres Creds to RCE

https://mdisec.com/inside-posthog-how-ssrf-a-clickhouse-sql-escaping-0day-and-default-postgresql-...
30•arwt•1h ago•4 comments

Show HN: High-Performance Wavelet Matrix for Python, Implemented in Rust

https://pypi.org/project/wavelet-matrix/
34•math-hiyoko•2h ago•0 comments

Coursera to combine with Udemy

https://investor.coursera.com/news/news-details/2025/Coursera-to-Combine-with-Udemy-to-Empower-th...
346•throwaway019254•9h ago•199 comments

AWS CEO says replacing junior devs with AI is 'one of the dumbest ideas'

https://www.finalroundai.com/blog/aws-ceo-ai-cannot-replace-junior-developers
584•birdculture•5h ago•336 comments

A Safer Container Ecosystem with Docker: Free Docker Hardened Images

https://www.docker.com/blog/docker-hardened-images-for-every-developer/
220•anttiharju•5h ago•50 comments

I got hacked: My Hetzner server started mining Monero

https://blog.jakesaunders.dev/my-server-started-mining-monero-this-morning/
52•jakelsaunders94•1h ago•41 comments

Here is the 15 sec coding test I used to instantly filter out most applicants

https://josezarazua.com/im-a-former-cto-here-is-the-15-sec-coding-test-i-used-to-instantly-filter...
5•kevin061•32m ago•1 comment

Tell HN: HN was down

394•uyzstvqs•5h ago•248 comments

Zmij: Faster floating point double-to-string conversion

https://vitaut.net/posts/2025/faster-dtoa/
61•fanf2•3d ago•4 comments

Fast Sequence Iteration in Common Lisp

https://world-playground-deceit.net/blog/2025/12/fast-sequence-iteration-in-common-lisp.html
10•BoingBoomTschak•4d ago•0 comments

Cloudflare Radar 2025 Year in Review

https://radar.cloudflare.com/year-in-review/2025
7•ksec•29m ago•0 comments

Launch HN: Kenobi (YC W22) – Personalize your website for every visitor

23•sarreph•5h ago•40 comments

The State of AI Coding Report 2025

https://www.greptile.com/state-of-ai-coding-2025
50•dakshgupta•5h ago•60 comments

Flick (YC F25) Is Hiring Founding Engineer to Build Figma for AI Filmmaking

https://www.ycombinator.com/companies/flick/jobs/Tdu6FH6-founding-frontend-engineer
1•rayruiwang•5h ago

Notes on Sorted Data

https://amit.prasad.me/blog/sorted-data
48•surprisetalk•6d ago•7 comments

VRChat: “There are more Japanese creators than all other countries combined”

https://twitter.com/chyadosensei/status/2001356290531156159
32•numpad0•1h ago•13 comments

Doublespeed hacked, revealing what its AI-generated accounts are promoting

https://www.404media.co/hack-reveals-the-a16z-backed-phone-farm-flooding-tiktok-with-ai-influencers/
132•grahamlee•3h ago•69 comments

I couldn't find a logging library that worked for my library, so I made one

https://hackers.pub/@hongminhee/2025/logtape-fedify-case-study
18•todsacerdoti•5d ago•19 comments

Announcing the Beta release of ty

https://astral.sh/blog/ty
803•gavide•1d ago•149 comments

I created a publishing system for step-by-step coding guides in Typst

https://press.knowledge.dev/p/new-150-pages-rust-guide-create-a
24•deniskolodin•4d ago•6 comments

AI Isn't Just Spying on You. It's Tricking You into Spending More

https://newrepublic.com/article/204525/artificial-intelligence-consumers-data-dynamic-pricing
33•c420•1h ago•5 comments

Learning Fortran (2024)

https://uncenter.dev/posts/learning-fortran/
39•lioeters•8h ago•41 comments

No AI* Here – A Response to Mozilla's Next Chapter

https://www.waterfox.com/blog/no-ai-here-response-to-mozilla/
506•MrAlex94•1d ago•282 comments

Is Mozilla trying hard to kill itself?

https://infosec.press/brunomiguel/is-mozilla-trying-hard-to-kill-itself
776•pabs3•12h ago•689 comments

Thin desires are eating life

https://www.joanwestenberg.com/thin-desires-are-eating-your-life/
709•mitchbob•1d ago•235 comments

AI's real superpower: consuming, not creating

https://msanroman.io/blog/ai-consumption-paradigm
196•firefoxd•13h ago•136 comments

Pornhub extorted after hackers steal Premium member activity data

https://www.bleepingcomputer.com/news/security/pornhub-extorted-after-hackers-steal-premium-membe...
24•coloneltcb•1h ago•3 comments

TLA+ Modeling Tips

http://muratbuffalo.blogspot.com/2025/12/tla-modeling-tips.html
106•birdculture•14h ago•28 comments

The State of AI Coding Report 2025

https://www.greptile.com/state-of-ai-coding-2025
50•dakshgupta•5h ago

Comments

dakshgupta•5h ago
Hi, I'm Daksh, a co-founder of Greptile. We're an AI code review agent used by 2,000 companies from startups like PostHog, Brex, and Partiful, to F500s and F10s.

About a billion lines of code go through Greptile every month, and we're able to do a lot of interesting analysis on that data.

We decided to compile some of the most interesting findings into a report. This is the first time we've done this, so any feedback would be great, especially around what analytics we should include next time.

wrs•1h ago
It’s hard to reach any conclusion from the quantitative code metrics in the first section, because as we all know, more code is not necessarily better. “Quantity” is not actually the same as “velocity”. And that gets to the most important question people have about AI assistance: does it help you maintain a codebase long term, or does it help you fly headlong into a ditch?

So, do you have any quality metrics to go with these?

dakshgupta•1h ago
We weren’t able to find a good quality measure. LLM-as-judge didn’t feel right. You’re correct that without that, the data is interesting but not particularly insightful.
ChrisbyMe•1h ago
Hey! Thanks for publishing this.

Would be interested in seeing the breakdown between uplift vs company size.

e.g. I work in a FAANG and have seen an uptick in the number of lines on PRs, partially due to AI coding tools and partially due to incentives for performance reviews.

dakshgupta•47m ago
This is a good one, wish we had included it. I'd run some analysis on this a while ago and it was pretty interesting.

An interesting subtrend is that Devin and other full async agents write the highest proportion of code at the largest companies. Ticket-to-PR hasn't worked nearly as well for startups as it has for the F500.

neom•1h ago
If AI tools are making teams 76% faster with 100% more bugs, one would presume you're not more productive, you're just punting more debt. I'm no expert on this stuff, but coupling it with some type of defect density insight might be helpful. Would also be interested to know what percentage of AI-assisted code is "rolled back" or "reverted" within 48 hours. Has there been any change in the number of review iterations over time?
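
As an illustration of the "reverted within 48 hours" metric floated above - a minimal sketch in Python, assuming reverts were made with `git revert` and kept its default "This reverts commit <sha>" message (squash-reverts and manual rollbacks would be missed; the 48-hour window is illustrative, not from the report):

  import re
  import subprocess

  WINDOW_HOURS = 48  # illustrative threshold, not a number from the report

  def commit_times(repo="."):
      # Map full sha -> author timestamp (unix seconds) for every commit.
      out = subprocess.run(
          ["git", "-C", repo, "log", "--format=%H %at"],
          capture_output=True, text=True, check=True,
      ).stdout
      return {sha: int(ts) for sha, ts in (l.split() for l in out.splitlines())}

  def quick_reverts(repo="."):
      # Scan every commit message for git's default revert marker and
      # report reverts that landed within WINDOW_HOURS of the original.
      times = commit_times(repo)
      log = subprocess.run(
          ["git", "-C", repo, "log", "--format=%H%n%B%x00"],
          capture_output=True, text=True, check=True,
      ).stdout
      for block in log.split("\x00"):
          lines = block.strip().splitlines()
          if not lines:
              continue
          revert_sha = lines[0]
          m = re.search(r"This reverts commit ([0-9a-f]{40})", block)
          if m and m.group(1) in times:
              gap_h = (times[revert_sha] - times[m.group(1)]) / 3600
              if 0 <= gap_h <= WINDOW_HOURS:
                  print(f"{m.group(1)[:8]} reverted by {revert_sha[:8]} after {gap_h:.0f}h")

  quick_reverts()
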
jacekm•45m ago
> About a billion lines of code go through Greptile every month, and we're able to do a lot of interesting analysis on that data.

Which stats in the report come from such analysis? I see that most metrics are based on either data from your internal teams or publicly available stats from npm and PyPi.

Regardless of the source, it's still an interesting report, thank you for this!

dakshgupta•35m ago
Thanks! The first 4 charts as well as Chart 2.3 are all from our data!
chis•42m ago
Wish you'd show data from past years too! It's hard to know if these are seasonal trends or random variance without that.

Super interesting report though.

psunavy03•1h ago
Sigh . . . once again I see "velocity" as something to be increased.

This makes me metaphorically stabby.

dakshgupta•1h ago
We were trying not to insinuate that, because we don’t have a good way to measure quality, without which velocity is useless.
locusofself•1h ago
This is definitely interesting information and I plan to take a deeper look at it.

What a lot of us must be wondering though is:

- how maintainable is the code being outputted

- how much is this newfound productivity saving (costing) on compute, given that we are definitely seeing more code

- how many livesite/security incidents will be caused by AI generated code that hasn't been reviewed properly

dakshgupta•1h ago
We weren’t able to agree on a good way to measure this. Curious - what’s your opinion on code churn as a metric? If code simply persists over some number of months, is that an indication it’s good quality code?
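
For reference, churn in this sense is cheap to approximate from git history alone. A minimal sketch, assuming a plain git checkout; it sums lines added plus deleted per file over a window via `git log --numstat` and normalizes by the file's current length (the six-month window is an arbitrary choice, not anything from the report):

  import subprocess
  from collections import Counter
  from pathlib import Path

  def churn(repo=".", since="6 months ago"):
      # --format= suppresses commit headers, leaving only numstat lines:
      # "<added>\t<deleted>\t<path>" (a "-" marks binary files).
      out = subprocess.run(
          ["git", "-C", repo, "log", "--numstat", "--format=", f"--since={since}"],
          capture_output=True, text=True, check=True,
      ).stdout
      touched = Counter()
      for line in out.splitlines():
          parts = line.split("\t")
          if len(parts) == 3 and parts[0] != "-":
              touched[parts[2]] += int(parts[0]) + int(parts[1])
      scores = {}
      for path, changed in touched.items():
          f = Path(repo) / path
          if f.is_file():
              current_len = sum(1 for _ in f.open(errors="ignore")) or 1
              scores[path] = changed / current_len  # high = frequently rewritten
      return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

  for path, score in churn()[:20]:
      print(f"{score:6.1f}  {path}")
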
wordpad•50m ago
I've seen code entropy suggested as the heuristic to measure.
arcwhite•23m ago
I've seen code persist a long time because it is unmaintainable gloop that takes forever to understand and nobody is brave enough to rebuild it.

So no, I don't think persistence-through-time is a good metric. Probably better to look at cyclomatic complexity, and maybe, for a given code path or module or class hierarchy, how many calls it makes within itself vs. to things outside the hierarchy - some measure of how many files you need to jump between to understand it.
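
Both suggestions are mechanizable. A rough sketch for Python sources only (an illustration, not a production analyzer): cyclomatic complexity approximated by counting branch points per function with the standard-library `ast` module, and "call locality" by checking whether called names are defined in the same module:

  import ast
  import sys

  # Branch-opening node types; BoolOp/IfExp approximate short-circuit branches.
  BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)

  def analyze(path):
      tree = ast.parse(open(path).read(), filename=path)
      # Names defined anywhere in this module (functions and classes).
      local = {n.name for n in ast.walk(tree)
               if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))}
      for fn in ast.walk(tree):
          if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
              continue
          complexity = 1 + sum(isinstance(n, BRANCHES) for n in ast.walk(fn))
          calls = [n.func.id for n in ast.walk(fn)
                   if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)]
          internal = sum(name in local for name in calls)
          print(f"{path}:{fn.name} complexity={complexity} "
                f"internal_calls={internal}/{len(calls)}")

  for p in sys.argv[1:]:
      analyze(p)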

nekooooo•1h ago
i'm a designer and even i know not to measure 'lines of code' as meaningful output or impact. are we really doing this?
dakshgupta•1h ago
We expressly did not conclude that more lines = better. You could easily argue more lines = worse. All we wanted to show is that there are more lines.
poliphili•32m ago
Language like "productivity gains", "output" and "force multiplier" isn't neutral like you're claiming here, and does imply that the line count metric indicates value being delivered for the business.
simonw•1h ago
> Lines of code per developer grew from 4,450 to 7,839 as AI coding tools act as a force multiplier.

Is that a per-year number?

If a year has 200 working days that's still only about 40 lines of code a day.

When I'm in full-blown work mode with a decent coding agent (usually Claude Code) I'm genuinely producing 1,000+ lines of (good, tested, reviewed) code a day.

Maybe there is something to those absurd 10x multiplier claims after all!

(I still think there's plenty of work done by software engineers that isn't crunching out code, much of which isn't accelerated by AI assistance nearly as much. 40 lines of code per day felt about right for me a few years ago.)
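
For scale, the arithmetic under both readings of the 7,839 figure (the report's author clarifies downthread that it is per month):

  # 7,839 LOC/developer read as a yearly vs. a monthly figure,
  # assuming ~200 working days per year and ~21 per month.
  print(round(7839 / 200))  # ~39 lines per working day if yearly
  print(round(7839 / 21))   # ~373 lines per working day if monthly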

rnewme•1h ago
1k LOC per day, or 1k git additions? I don't think one person can consistently review 1k LOC, grow a codebase at that speed and size, and classify it as good, tested, and reviewed. Can you tell us more about your process?
simonw•1h ago
I'm effectively no longer typing code by hand: I decide what change I want to make and then prompt Claude Code to describe that change. Sometimes I'll have it figure out the fix too.

An example from earlier today: https://github.com/simonw/llm-gemini/commit/fa6d147f5cff9ea9...

That commit added 33 lines and removed 13 - so I'm already at a 20-lines-a-day level just from that one commit (and I shipped a few more plus a release of llm-gemini: https://github.com/simonw/llm-gemini/commits/a2bdec13e03ca8a...)

It took about 3.5 minutes. I started from this issue someone had filed against my repo:

Then I opened Claude Code and said:

  Run this command: uv run llm -m gemma-3-27b-it hi

That ran the command and returned the error message. I then said:

  Yes, fix that - the gemma models do not support media resolution

Which was enough for it to figure out the fix and run the tests to confirm it hadn't broken anything.

I ran "git diff", thought about the change it had made for a moment, then committed and pushed it.

Here's the full Claude Code transcript: https://gistpreview.github.io/?62d090551ff26676dfbe54d8eebbc...

I verified the fix myself by running:

  uv run llm -m gemma-3-27b-it hi

I pasted the result into an issue comment to prove to myself (and anyone else who cares) that I had manually verified the fix: https://github.com/simonw/llm-gemini/issues/116#issuecomment...

Here's a more detailed version of the transcript including timestamps, showing my first prompt at 10:01:13am and the final response at 10:04:55am. https://tools.simonwillison.net/claude-code-timeline?url=htt...

I built that claude-code-timeline application this morning too, and that thing is 2284 lines of code: https://github.com/simonw/tools/commits/main/claude-code-tim... - but that was much more of a vibe-coded thing, I hardly reviewed the code that was written at all and shipped it as soon as it appeared to work correctly. Since it's a standalone HTML file there's not too much that can go wrong if it has bugs in it.

WhyOhWhyQ•1h ago
Whenever I start reviewing code produced by Claude I find hundreds of ways to improve it.

I don't know if code quality really matters to most people or to the bottom line, but a good software engineer writes better code than Claude. It is a testament to library maintainers that Claude is able to code at all, in my opinion. One reason is that Claude uses APIs in wacky ways. For instance, by reading the SDL2 documentation I was able to find many places where Claude writes SDL2 code using archaic patterns from the SDL 1 days.

I think there are a lot of hidden ways AI booster types benefit from basic software engineering practices, even as they actively promote damaging ideas about those practices. Maybe it will only be 10 years from now that we learn that having good engineers is actually important.

simonw•14m ago
> Whenever I start reviewing code produced by Claude I find hundreds of ways to improve it.

Same here. So I tell it what improvements I want to make and watch it make them.

I've gained enough experience at prompting it that it genuinely is faster for me to tell it the change I want to make than it is for me to make that change myself, 90% of the time.

lumost•1h ago
There is a long tail of engineers working on mature/stable codebases where there are fewer extremely large diffs, or the review burden is extremely high. If you work on core software - e.g. places where you might need 2-3 code approvers or more - then you can never say that a line of code was wrong "because of the AI."
cmdtab•1h ago
I saw your example and it was a simple cli tool. Of course you can have claude make commits effectively to it!
simonw•1h ago
Totally. I have dozens of "simple CLI tools" that I work on - and small plugins, and HTML+JavaScript utilities.

If I was hacking on the Linux kernel I would be delighted with myself for producing 40 lines of landed code in a single day.

eikenberry•34m ago
They are obviously talking about writing code against expectations greater than these simple tools. Why troll with the hyperbole?
leothetechguy•1h ago
I couldn't in good conscience work like that; I believe the risk of bad AI-generated code due to the tiniest of output variations is far too high. Especially in systems that need to maintain a large state governed by many rules and edge cases.
dakshgupta•1h ago
This is per month, I see now that's not super clear on the chart!
CrzyLngPwd•1h ago
1,000 lines of debt that you didn't review and probably have no idea what they do.
AlexandrB•1h ago
Yeah, I don't get it. It's well known that "LOC" is not a good metric of developer productivity. But now that AI is writing those lines of code, it's fine as a metric?
noosphr•1h ago
Senior developers know that every line of code is debt. Junior developers think that every line of code is wealth.
noosphr•1h ago
I'm a good aerospace engineer, my rockets weigh an extra 50kg after every day I work on them.
WhyOhWhyQ•1h ago
You're writing Python and JavaScript, right? Those languages are extremely easy to write in (which conversely means the legibility is likely to be poor). People maintaining legacy systems in systems-level languages aren't going to be able to produce as much code as people writing Python and JavaScript.
simonw•7m ago
Yes, mostly Python and JavaScript and SQL. I'm dabbling a little more with Go these days too.
observationist•1h ago
If you actually work, the amount of work you do is absurdly more than what most others do, and a lot of the time both the high- and low-productivity people assume everyone just does about as much as they do.

A lot of people are oblivious to Zipf distributions in effort and output, and if you ever catch on to it as a productive person, it really reframes ideas about fairness and policy and good or bad management.

It also means that you can recognize a good team, and when a bunch of high performers are pushing and supporting each other and being held to account out in the open, amazing things happen that just make other workplaces look ridiculous.

My hope for AI is that instead of 20% of the humans doing 80% of the work, you end up with force multipliers and a ramping up, so that more workplaces look like high-functioning teams, making everything more fair and engaging and productive. But I suspect that once people get better with AI, at least up to the point of AGI, we're going to see the same distribution but at 10x or 50x the productivity.

waterproof•1h ago
Looks like it's a monthly number.
zkmon•1h ago
I take these "code-output" metrics with a pinch of salt. Of course, a machine can generate 1,000 times more lines of code, much as a power loom does. However, the comparison with the power loom ends there.

How maintainable is this code output? I saw an SPA HTML file produced by a model that looked almost like assembly code. So if the code can only be maintained by a model, then an appropriate metric should be based on the long-term maintainability achieved, not on the instant generation of code.

hvb2•1h ago
Agreed, I stopped reading at that point. You can't create a report that uses LOC as its measure and expect to be taken seriously.

I feel like we humans try to separate things and keep things short. We do this not because we think it's pretty; we do it so our human brains can still reason about a big system. By that logic LOC is a bad measure, since being concise would then hurt your "productivity"?

dakshgupta•1h ago
We're careful not to draw any conclusions from LoC. The fact is LoC counts are higher, which by itself is interesting. This could be a good or bad thing depending on code quality, which itself varies wildly person-to-person and agent-to-agent.
mrdependable•1h ago
Can you expand on why it is interesting?
zed31726•1h ago
Because it's different. Change is important to track.
a_imho•1h ago
> My point today is that, if we wish to count lines of code, we should not regard them as "lines produced" but as "lines spent": the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger.

As a dev I very much subscribe to this line of thought, but I also have to admit most of the business class people would disagree.

dakshgupta•48m ago
How would you measure code quality? Would persistence be a good measure?
epicureanideal•22m ago
Bad code can persist because nobody wants to touch it.

Unfortunately I’m not sure there are good metrics.

scuff3d•19m ago
That question has been baffling product managers, scrum masters, and C-suite assholes for decades. Along with how you measure engineering productivity.
scuff3d•26m ago
It shouldn't be taken with a pinch of salt; it should be disregarded entirely. It's an utterly useless metric, and the fact that the report leads with it makes the entire thing suspect.
apercu•49s ago
When I was first learning Perl, after being a shell scripter/sysadmin, I produced a lot of code. 2-3 years later the same tasks took way less code. So is more code good?

Also, my anecdotal experience is that LLM code is flat-out wrong sometimes. Like a significant percentage of the time. I can't quote a number really, because I rarely do the same or a similar thing twice. But it's a double-digit percentage.

TuringNYC•1h ago
Kudos to the designer, this site is beautiful.
a1ff00•1h ago
Was going to comment the same. Love the dot matrix paper look.
dionian•1h ago
agreed. was it AI?! not that i care - i've been doing a lot of tailwind apps with AI with great success. AI is great for the web, takes all the tedium out of it
superchris•51m ago
This thing that can't be measured is up 76%. Eyeroll
vb-8448•50m ago
In the engineering team velocity section, the most important metric is missing: the change rate of new code, i.e. how many times it is changed before being fully consolidated.
dakshgupta•49m ago
This is a great suggestion. I'll note it down for next year's. Curious, do you think this would be a good proxy for code quality?
all2•43m ago
I would consider feature complete with robust testing to be a great proxy for code quality. Specifically, that if a chunk of code is feature complete and well tested and now changing slowly, it means -- as far as I can tell -- that the abstractions contained are at least ok at modeling the problem domain.

I would expect that code which continually changes, deprecating old features and creating new ones, is still looking for a good fit with its problem domain.

dakshgupta•34m ago
Most of our customers are enterprises, so I feel relatively comfortable assuming they have some decent testing and QA in place. Perhaps I am too optimistic?
vb-8448•34m ago
It's tricky, but one can assume that code written once and not touched in a while is good code (didn't cause any issues, performance is good enough, etc.).

I guess you can already derive this value if you sum the total lines changed by all PRs and divide by (SLOC end - SLOC start). Ideally it should be a value slightly greater than 1.
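
Making that arithmetic concrete with hypothetical numbers (purely illustrative):

  # Sum of lines changed across all merged PRs in the window,
  # divided by the net growth in SLOC over the same window.
  total_lines_changed_by_prs = 13_000      # sum across all merged PRs
  sloc_start, sloc_end = 90_000, 100_000   # codebase size at window edges

  ratio = total_lines_changed_by_prs / (sloc_end - sloc_start)
  print(ratio)  # 1.3 -> each net-new line took ~1.3 lines of PR churn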

sillyfluke•33m ago
It depends on how well you vetted your samples.

fyi: You headline with "cross-industry", lead with fancy engineering productivity graphics, then caption it with small print saying it's from your internal team data. Unless I'm completely missing something, it comes off as a little misleading and disingenuous. Maybe intro with what your company does and your data collection approach.

dakshgupta•27m ago
Apologies, that is poor wording on our part. It's internal data from engineers who use Greptile - tens of thousands of people from a variety of industries - as opposed to the external, public data that some of the other charts are based on.
magicloop•9m ago
Your graphs roughly marry up with my anecdotal experience. After a while, when you know when and how to utilize LLMs/agents, coding does become more productive. There is a discernible improvement in productivity at the same quality level.

Also, I notice it when the LLMs are offline. It feels a bit like when your internet connection fails: you remember the old days of lower productivity.

Of course, there are a lot of junk/silly ways to approach these tools, but all tools are just a lever and need judgement/skill to be used well.