CompileBench: Can AI Compile 22-year-old Code?

https://quesma.com/blog/introducing-compilebench/
73•jakozaur•2h ago

Comments

stared•2h ago
Curious about the ultimate benchmark - can AI compile Doom on an arbitrary device?
flenserboy•1h ago
that, & how well does it cope with Perl?
johnisgood•1h ago
Claude is good enough at Perl with lots of hand-holding and reiterations, in my experience.
piotrgrabowski•1h ago
Author here.

So far in this benchmark we based the tasks on a couple of open-source projects (like curl, jq, GNU Coreutils).

Even on those "simple" projects we managed to make the tasks difficult - Claude Opus 4.1 was the only one to correctly cross-compile curl for arm64 (+ make it statically-linked) [1].

In the future we'd like to test it with projects like FFmpeg or chromium - those should be much more difficult.

[1] https://www.compilebench.com/curl-ssl-arm64-static/
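For readers who want to sanity-check a result like the arm64 static curl build themselves, here is a minimal sketch in Python. It is not how CompileBench grades the task (the comment above does not say how that is done); the function name `inspect_elf` and the returned keys are made up for illustration. It reads the ELF header, checks that the machine type is AArch64, and treats the absence of a `PT_INTERP` program header as a rough sign of static linking.

```python
import struct

EM_AARCH64 = 183  # e_machine value for 64-bit ARM in the ELF spec
PT_INTERP = 3     # program header type that names the dynamic loader


def inspect_elf(data: bytes) -> dict:
    """Hypothetical helper: report target arch and a static-linking heuristic."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    e_machine = struct.unpack_from(endian + "H", data, 18)[0]

    # Program header table offset, entry size, and entry count.
    if is_64:
        e_phoff = struct.unpack_from(endian + "Q", data, 32)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        e_phoff = struct.unpack_from(endian + "I", data, 28)[0]
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 42)

    # p_type is the first 4-byte field of every program header.
    has_interp = False
    for i in range(e_phnum):
        p_type = struct.unpack_from(endian + "I", data, e_phoff + i * e_phentsize)[0]
        if p_type == PT_INTERP:
            has_interp = True

    return {
        "is_arm64": is_64 and e_machine == EM_AARCH64,
        # Assumption: no PT_INTERP means no dynamic loader is requested.
        "looks_static": not has_interp,
    }
```

Calling `inspect_elf(open("curl", "rb").read())` on a build output would return both flags. The `PT_INTERP` check is only a heuristic, so running `file` or `readelf -l` on the binary is a sensible cross-check.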

OtherShrezzing•28m ago
For the _reviving 20 year old code_ type tasks, are the tested outcomes things we'd expect to be in the public domain? For example, in the way the 'SWEBenchVerified' tests are poisoned tests, because the LLMs are able to look up bug fixes in the project git repository.
jcranmer•10m ago
A long time ago, I did a project where I downloaded a year's worth of nightly builds for Thunderbird so that I could collect nightly code coverage information. Over the course of doing so, I discovered that there was one dependency (pango, I think?) such that no version could support the entire year's worth of source--the newer version didn't work with the older builds, and the older version didn't work with the newer builds.

Come to think of it, in terms of trying to get old code building, the CVS days of Firefox should be interesting... because the first command in that build step is "download the source code" and that CVS server isn't running anymore. And some of the components are downloaded from a CVS tag rather than trunk, and the converted CVS repositories I'm aware of all only converted the trunk and none of the branches or tags.

nl•1h ago
This is a really good benchmark. So much time is spent on these messy types of tasks and no one really likes doing it.

Now if it could fix React Native builds after package upgrades I'd be impressed...

bgwalter•1h ago
LGTM! I'm sure it comes with a correctness proof, too!

The newer blog posts appear to scan forums like this one for objections ("AI" does not work for legacy code bases) and then create custom "benchmarks" for their sales people to point to if they encounter these objections.

falcor84•1h ago
> Our toughest challenges include cross-compiling to Windows or ARM64 and resurrecting 22-year-old source code from 2003 on modern systems. Some agents needed 135 commands and 15 minutes just to produce a single working binary.

I found that "just" there to be so funny in terms of how far the goal posts moved over these last few years (as TFA does mention). I personally am certain that it would have taken me significantly longer than that to do it myself.

ACCount37•39m ago
15 minutes?

And here's me, after 4 straight days of wrangling an obscure cross-compilation toolchain to resurrect some ill-fated piece of software from year 2011 in a modern embedded environment.

Philpax•1h ago
Excellent benchmark. May I suggest an extension: "port any pre-uv Python ML codebase to uv so that it can actually be reliably reproduced"?
buildbot•1h ago
I’ve been doing this a lot! AI seems to really excel at setting up compiler boilerplate/minor modifications for new arch. I made a simple cpu information utility work on HP PA-RISC and Sparc64 :)
sehugg•49m ago
I have tried to get Claude to compile arbitrary C++ projects with Emscripten, and its track record is about as good as mine.
jclay•49m ago
the libs in the bench don’t really have any external deps. It will be much more interesting to see the results with ffmpeg, Qt, etc. The original source releases from any repo here would also be great candidates: https://github.com/id-software
shallichange•39m ago
I hadn’t thought of that use case. Say for example you find 1990’s Clipper code and want to give it a try on a modern Linux. Thanks
