We Are Changing Our Developer Productivity Experiment Design

https://metr.org/blog/2026-02-24-uplift-update/
26•ej88•4h ago

Comments

ej88•3h ago
Really interesting updates to their 2025 experiment.

Repeat devs from the original experiment went from a 0-40% slowdown to somewhere between a 10% slowdown and a 40% speedup, and METR estimates this as a 'lower-bound'

more devs are saying they don't even want to do 50% of their work without AI, even for $50/hr

30-50% of devs decided not to submit certain tasks without AI, missing the tasks with the highest uplift

it also seems like there is a skill gap - repeat devs from the first study are more productive with ai tools than newly recruited ones with variable experience

overall it seems like the high preference for devs to use AI is actually hurting METR's ability to judge their speedup, due to a refusal to do tasks without it. imo this is indirectly quite supportive for ai coding's productivity claims.
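
To make the lower-bound argument concrete, here is a minimal simulation sketch of how refusing high-uplift tasks in the no-AI arm could pull the measured speedup down. Everything in it is an assumption for illustration only: the uniform distributions, the refusal threshold, and the function name `estimate_speedup` are hypothetical and are not taken from METR's data or method.

```python
import random


def estimate_speedup(n_tasks=20000, refuse_above=2.0, seed=0):
    """Return (full, observed) speedup estimates under a toy refusal model."""
    rng = random.Random(seed)
    ai_times, no_ai_all, no_ai_submitted = [], [], []
    for _ in range(n_tasks):
        ai_hours = rng.uniform(1.0, 3.0)   # hypothetical time with AI
        uplift = rng.uniform(1.0, 3.0)     # hypothetical true speedup factor
        if rng.random() < 0.5:
            # Task randomized to the AI-allowed arm.
            ai_times.append(ai_hours)
        else:
            # Task randomized to the AI-disallowed arm.
            no_ai_hours = ai_hours * uplift
            no_ai_all.append(no_ai_hours)
            # Assumption: devs never submit tasks whose uplift is above the threshold.
            if uplift <= refuse_above:
                no_ai_submitted.append(no_ai_hours)

    def mean(xs):
        return sum(xs) / len(xs)

    full = mean(no_ai_all) / mean(ai_times)            # no refusals
    observed = mean(no_ai_submitted) / mean(ai_times)  # with refusals
    return full, observed


if __name__ == "__main__":
    full, observed = estimate_speedup()
    print(f"speedup without refusals: {full:.2f}x")
    print(f"speedup with refusals:    {observed:.2f}x")
```

In this toy model the observed ratio comes out below the no-refusal ratio, because the slowest no-AI tasks are exactly the ones that go missing. Real refusal behavior is likely messier than a hard threshold, so the direction of the bias in the actual study is less clear-cut than it is here.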

roxolotl•3m ago
The finding of the first study was that people cannot judge their own performance with these tools. So I don't think the lack of individuals willing to work without them is indicative of productivity improvements. I think it's indicative of them being enjoyable to use.
softwaredoug•3h ago
I'm a bit perplexed by the developer selection effects.

I get that developers want to use AI. But are they also claiming there's not still a no/low-AI population of developers? Or that their means of selection don't find these developers?

Are they worried that by splitting devs into groups of AI experience they might be measuring some confounder that causes people to choose AI / not AI in their careers?

sgillen•1h ago
The study was designed to have devs who are comfortable with AI perform 50% of tasks with AI and 50% without. So the problem is the population of "Developers who use AI regularly but are willing to do tasks without AI" is shrinking.

>> Are they worried that by splitting devs into groups of AI experience they might be measuring some confounder that causes people to choose AI / not AI in their careers?

The developer sample size was small (16 people in the original study) and the task sample size is larger (~250 tasks). I think the worry is variance in developer productivity would totally wash out any signal.
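
A rough sketch of that worry, comparing a design that randomizes each task within a developer against one that splits developers into an AI group and a no-AI group. The 16 developers and ~250 tasks come from the numbers above; the 20% effect size, the spread of developer skill, and the task noise are assumptions made up for illustration, and `simulate` and `spread` are hypothetical helper names.

```python
import math
import random
import statistics

N_DEVS = 16                      # developers in the original study
N_TASKS = 250                    # approximate number of tasks
TRUE_LOG_EFFECT = math.log(0.8)  # assumed: AI cuts task time by 20%


def simulate(rng, within_dev):
    """Return one estimate of the log time ratio (AI vs. no AI)."""
    # Assumed developer-to-developer speed differences on a log scale.
    dev_skill = [rng.gauss(0.0, 0.5) for _ in range(N_DEVS)]
    ai_logs, no_ai_logs = [], []
    for task in range(N_TASKS):
        dev = task % N_DEVS
        if within_dev:
            use_ai = rng.random() < 0.5   # randomize every task
        else:
            use_ai = dev < N_DEVS // 2    # half the devs always use AI
        log_time = dev_skill[dev] + rng.gauss(0.0, 0.3)  # assumed task noise
        if use_ai:
            log_time += TRUE_LOG_EFFECT
        (ai_logs if use_ai else no_ai_logs).append(log_time)
    return statistics.mean(ai_logs) - statistics.mean(no_ai_logs)


def spread(within_dev, reps=300, seed=0):
    """Standard deviation of the estimate across repeated simulated studies."""
    rng = random.Random(seed)
    estimates = [simulate(rng, within_dev) for _ in range(reps)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    print("between-developer spread:", round(spread(False), 3))
    print("within-developer spread: ", round(spread(True), 3))
```

Under these assumptions the between-developer estimate varies far more from run to run, since with only 8 developers per group the skill differences dominate the 20% effect. That is the sense in which developer variance could wash out the signal.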

selridge•1h ago
Here is my read:

Developers are refusing to complete the survey or selecting themselves out because they (apparently) don’t want to complete the non-AI task.

They also saw selection effects from a large reduction in the pay for the study (which is an unfortunate confounder here), $150/hr -> $50/hr.

They guess this makes their estimates lower bounds, but the selection effect is complicated (which they acknowledge).

Overall this is a hard problem for them in the current state. It will be challenging to produce convincing year over year analysis under these conditions.

sgillen•1h ago
This is very interesting because I see a lot of AI detractors point to the original study as proof that AI is overhyped and nothing to worry about. In this new study the findings are essentially reversed (20% slowdown to 20% speedup).
ej88•1h ago
not enough people look at the slope, just the coords
selridge•59m ago
I think their old findings were hard to treat as gospel just due to the kind of comparison + the sample, but this new result is probably much noisier.

It’s hard to make reliable, directional assumptions about the kind of self-selection and refusal they saw, even without worrying about the reward dropping 66%.

simonw•45m ago
AI detractors loved that previous study so much. It seems to have been brought up in the majority of conversations about AI productivity over the past six months.

(Notable to me was how few other studies they cited, which I think is because studies showing AI productivity loss are quite uncommon.)

arctic-true•1h ago
Those developer quotes are tough to read. Rate limits are going to hit like a truck when the labs eventually need to make a profit.
simonw•46m ago
At this point the AI labs would pretty much have to form an illegal price fixing cartel in order to jack the prices up, they've been competing to drive down prices for so long.

They'd have to get the Chinese AI labs to go along with that price fixing too.

arctic-true•13m ago
They’d have an entire country of geniuses prepared to defend against the antitrust allegations, who’s to stop them? /s
camgunz•1h ago
Unless this measures the entire SDLC longitudinally (like say, over a year) I'm not interested. I too can tell Claude Code to do things all day every day, but unless we have data on the defect rate it doesn't matter at all.
Bnjoroge•51m ago
never been a better time to be a swe who doesn't use, or significantly limits the use of, AI agents
atleastoptimal•46m ago
It's kind of funny that METR is known primarily for both the most bearish study on AI progress (the original 20% slowdown one), and the most bullish one on AI progress (the long-task horizon study showing exponential increase in duration of tasks AI models can accomplish with respect to date of release).

In either case, it seems people often bolstered preexisting views on AI with whichever study most affirmed them (for the former, that AI coding models didn't actually help and created a mirage of productivity that required more work to fix than was worth it; for the latter, that AI models were improving at an exponential rate and will invariably eclipse SWEs in all tasks in a deterministic amount of time).

I think the truth is somewhere in the middle. Just anecdotally we've seen multi-million dollar fortunes being minted by small teams developing using 90% AI-assisted coding. Anthropic claims they solely use agents to code and don't modify any code manually.

daxfohl•21m ago
"I don't want to do this without AI" sounds like we're already well into the brain atrophy stage of this. Now what? (I'd think about it myself but....)
marcosdumay•1m ago
"I avoid issues like AI can finish things in just 2 hours, but I have to spend 20 hours. I will feel so painful if the task is decided as AI-disallowed."

That really doesn't sound like the results they got, where developers may get up to twice as productive in the best-case scenario.

There's surely something scary there. And the lack of people ambivalent about AI isn't a certain indication that it's as well accepted as they think; it could just as easily be caused by polarization.
