
Advent of Compiler Optimisations 2025

https://xania.org/202511/advent-of-compiler-optimisation
384•vismit2000•2mo ago

Comments

ktallett•2mo ago
This is really cool. Congrats on the quality of the work!
filosofo_rancio•2mo ago
Thanks for sharing. I've always found optimization a really interesting field; I'll keep a close eye on this!
squater•2mo ago
You can never have too much Godbolt!
bspammer•2mo ago
I really appreciate that despite being an obvious domain expert, he’s starting with the simple stuff and not jumping straight into crazy obscure parts of the x86 instruction set
adev_•2mo ago
Matt Godbolt is an absolute gem for the C & C++ community.

Many thanks to him for that.

Between that and compiler explorer, it is fair to say he made the world a better place for many of us, developers.

cyberax•2mo ago
Wait?!? Godbolt is actually a real person!?!?
ubj•2mo ago
This is apparently such a common misunderstanding that it was put at the bottom of the C++ iceberg:

https://victorpoughon.github.io/cppiceberg/

cyberax•2mo ago
I used godbolt.org dozens of times, and I never bothered to look at "about".

D'Oh.

Sponsoring him on Github right now...

mattgodbolt•2mo ago
I _think_ so, but this could all be some kind of simulation, I guess? :)
alfanick•2mo ago
Is there a PDF somewhere? I'm not really able to follow YT videos.
philipportner•2mo ago
There's a link to the AoCO2025 tag for his blog posts in the OP.
alberth•2mo ago
After 25-years of software development, I still wonder whether I’m using the best possible compiler flags.
cogman10•2mo ago
What I've learned is that fewer flags are the best path for any long-lived project.

-O2 is basically all you usually need. As you update your compiler, it'll end up tweaking exactly what that general optimization does based on what they know today.

Because that's the thing about these flags, you'll generally set them once at the beginning of a project. Compiler authors will reevaluate them way more than you will.

Also, a trap I've observed is setting flags based on bad benchmarks. This applies more to the JVM than a C++ compiler, but nevertheless, a system's current state is somewhat random. 1-2% fluctuations in performance for even the same app are normal. A lot of people won't realize that and ultimately add flags based on those fluctuations.

But further, how code is currently laid out can affect performance. You may see a speed boost not because you tweaked the loop unrolling variable, but rather because your tweak relocated a hot path to be slightly more cache friendly. A change in the code structure can eliminate that benefit.

201984•2mo ago
What's your reason for -O2 over -O3?
cogman10•2mo ago
Historically, -O3 has been a bit less stable (producing incorrect code) and more experimental (doesn't always make things faster).

Flags from -O3 often flow down into -O2 as they are proven generally beneficial.

That said, I don't think -O3 has the problems it once did.

201984•2mo ago
Thanks
sgerenser•2mo ago
-O3 gained a reputation of being more likely to "break" code, but in reality it was almost always "breaking" code that was invalid to start with (invoked undefined behavior). The problem is that C and C++ have so many UB edge cases that a large volume of existing code may invoke UB in certain situations. -O2 thus had a reputation of being more reliable. If you're sure your code doesn't invoke undefined behavior, though, then -O3 should be fine on a modern compiler.
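A hypothetical example (not from the thread) of the kind of "breaking" being described: code that only ever worked by accident.

```c
#include <limits.h>

/* Signed overflow is UB in C, so the optimizer is allowed to assume
 * `x + 1 > x` for any signed x and fold this check to 0.  At -O0 the
 * check often "works by accident"; at -O2/-O3 the branch can vanish
 * entirely. */
int wraps(int x) {
    return x + 1 < x;            /* UB when x == INT_MAX */
}

/* Well-defined rewrite: compare against the limit before adding. */
int wraps_safe(int x) {
    return x == INT_MAX;
}
```

Whether `wraps` "works" depends entirely on the optimization level, which is exactly why such code gets blamed on -O3.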
drob518•2mo ago
Exactly. A lot of people didn’t understand the contract between the programmer and the compiler that is required to use -O3.
MaxBarraclough•2mo ago
That's a little vague, I'd put that more pointedly: they don't understand how the C and C++ languages are defined, have a poor grasp of undefined behaviour in particular, and mistakenly believe their defective code to be correct.

Of course, even with a solid grasp of the language(s), it's still by no means easy to write correct C or C++ code, but if your plan is to go with "this seems to work", you're setting yourself up for trouble.

uecker•2mo ago
Oh, there are also plenty of bugs. And Clang still does not implement the aliasing model of C. For C, I would definitely recommend -O2 -fno-strict-aliasing
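A minimal sketch of the aliasing problem in question (hypothetical example, not from the thread): type-punning through a pointer cast is UB under C's strict-aliasing rules, which -fno-strict-aliasing tells the compiler to tolerate, while the memcpy form is always defined.

```c
#include <string.h>

/* UB under strict aliasing: a float object is read through an
 * unsigned-int lvalue, so an optimizer assuming the two types never
 * alias may reorder or drop the load. */
unsigned int bits_punned(float *f) {
    return *(unsigned int *)f;
}

/* Portable, well-defined version: copy the object representation. */
unsigned int bits_memcpy(float f) {
    unsigned int u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```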
afdbcreid•2mo ago
Indeed; e.g. Rust by default (release builds) uses -O3.
superxpro12•2mo ago
Don't forget about -Oz!
wavemode•2mo ago
You have to profile for your specific use case. Some programs run slower under O3 because it inlines/unrolls more aggressively, increasing code size (which can be cache-unfriendly).
grogers•2mo ago
Yeah, -O3 generally performs well in small benchmarks because of aggressive loop unrolling and inlining. But in large programs that face icache pressure, it can end up being slower. Sometimes -Os is even better for the same reason, but -O2 is usually a better default.
bluGill•2mo ago
Most people use -O2 and so if you use -O3 you risk some bug in the optimizer that nobody else noticed yet. -O2 is less likely to have problems.

In my experience a team of 200 developers will see 1 compiler bug affect them every 10 years. This isn't scientific, but it is a good rule of thumb and may put the above in perspective.

macintux•2mo ago
Would you say that bug estimate is when using -O2 or -O3?
bluGill•2mo ago
The estimate includes visual studio, and other compilers that are not open source for whatever optimization options we were using at the time. As such your question doesn't make sense (not that it is bad, but it doesn't make sense).

In the case of open source compilers the bug was generally fixed upstream and we just needed to get on a newer release.

nickelpro•2mo ago
People keep saying "O3 has bugs," but that's not true. At least no more bugs than O2. It did and does more aggressively expose UB code, but that isn't why people avoid O3.

You generally avoid O3 because it's slower. Slower to compile, and slower to run. Aggressively unrolling loops and larger inlining windows bloat code size to the degree it impacts icache.

The optimization levels aren't "how fast do you want the code to go", they're "how aggressive do you want the optimizer to be." The most aggressive optimizations are largely unproven and left in O3 until they are generally useful, at which point they move to O2.

SubjectToChange•2mo ago
More aggressive optimization is necessarily going to be more error prone. In particular, the fact that -O3 is "the path less traveled" means that a higher number of latent bugs exist. That said, if code breaks under -O3, then either it needs to be fixed or a bug report needs to be filed.
uecker•2mo ago
I would say there is a fair share of cases where programmers were told it is UB when it actually was a compiler bug - or non-conformance.
saagarjha•2mo ago
That share is a vanishingly small fraction of cases.
uecker•2mo ago
I am not sure. I saw quite a few of these bugs where programmers were told it is UB but it isn't.

For example, people showed me

  extern void g(int x);

  int f(int a, int b)
  {
    g(b ? 42 : 43);
    return a / b;
  }
as an example of how compilers exploit "time-travelling" UB to optimize code, but it was just a compiler bug that got fixed once I reported it:

https://developercommunity.visualstudio.com/t/Invalid-optimi...

Other compilers have similar issues.

nickelpro•2mo ago
You're an expert, you're overestimating the competence of the median programmer.

That's a great bug you found, and of course it is a compiler bug, not UB.

99.9% of the bugs I've dealt with of this sort were just pointer aliasing. Or just use-after-free. Or just buffer overruns.

The median programmer, especially in the good ol' days, wrote UB code about once every 6-10 hours.

uecker•2mo ago
Sure. All I am saying is that there are still plenty of compiler bugs related to optimization, which is reason enough for me to recommend being careful with optimization in contexts where correctness is important.
saagarjha•2mo ago
I agree that compilers have issues and that you have clearly run into some of them. I disagree that they are more common than writing UB.
uecker•2mo ago
Oh, I didn't mean to imply that they are more common, just that they are common enough to warrant being careful with optimizations.
saagarjha•2mo ago
Sure, I guess? In my experience I turn on the optimizer mostly without fear because I know that if, in the rare case I need to track down an optimizer bug, it would look the same as my process for identifying any other sort of crazy bug and in this case it will at least have a straightforward resolution.
o11c•2mo ago
Compiler speed matters. I will confess to not having as much practical knowledge of -O3, but -O2 is usually reasonably fast to compile.

For cases where -O2 is too slow to compile, dropping a single nasty TU down to -O1 is often beneficial. -O0 is usually not useful - while faster for tiny TUs, -O1 is still pretty fast for them, and for anything larger, the increased binary size bloat of -O0 is likely to kill your link time compared to -O1's slimness.

Also debuggability matters. GCC's `-O2` is quite debuggable once you learn how to work past the possibility of hitting an <optimized out> (going up a frame or dereferencing a casted register is often all you need); this is unlike Clang, which every time I check still gives up entirely.

The real argument is -O1 vs -O2 (since -O1 is a major improvement over -O0 and -O3 is a negligible improvement over -O2) ... I suppose originally I defaulted to -O2 because that's what's generally used by distributions, which compile rarely but run the code often. This differs from development ... but does mean you're staying on the best-tested path (hitting an ICE is pretty common as it is); also, defaulting to -O2 means you know when one of your TUs hits the nasty slowness.

While mostly obsolete now, I have also heard of cases where 32-bit x86 inline asm has difficulty fulfilling constraints under register pressure at low optimization levels.
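The per-TU override described above can be sketched in a build script (file names hypothetical):

```shell
# Build everything at -O2, but drop one pathologically slow-to-compile
# TU down to -O1; then link as usual.
cc -O2 -c main.c util.c
cc -O1 -c huge_generated.c
cc main.o util.o huge_generated.o -o app
```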

alberth•2mo ago
Doesn't -O2 still exclude any CPU features from the past ~15 years (like AVX)?

If you know the architecture and oldest CPU model, aren't we better served by adding a bunch more flags?

I wish I could compile my server code to target CPU released on/after a particular date like:

  -O2 -cpu-newer-than=2019
SubjectToChange•2mo ago
A CPU produced after a certain date is not guaranteed to have every ISA extension, e.g. SVE for Arm chips. Hence things like the microarchitecture levels for x86-64.
cogman10•2mo ago
For x86 it's a pretty good guarantee.
teo_zero•2mo ago
I don't understand if your comment is ironic. Intel is notorious for equipping different processors produced in the same period with different features. Sometimes even among different cores on the same chip. Sometimes later products have fewer features enabled (see e.g. AVX512 for Alder Lake).
cogman10•2mo ago
It's not an -O2 thing. Rather it's a -march thing.

-O2 in gcc has vectorization enabled and will use AVX if the target CPU supports it. It is less aggressive about vectorization than -O3.

singron•2mo ago
You can use x86-64-v2 or x86-64-v3. Dates are tricky, since CPU features aren't included on all SKUs from all manufacturers on a given date.
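The microarchitecture levels can be targeted directly with GCC/Clang flags; a sketch (file name hypothetical):

```shell
# Target a level rather than a release date: x86-64-v2 is roughly the
# SSE4.2/POPCNT era, x86-64-v3 adds AVX, AVX2, and BMI1/2.
cc -O2 -march=x86-64-v3 -c hot.c

# Or, for a binary that only ever runs on the build machine:
cc -O2 -march=native -c hot.c
```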
tmtvl•2mo ago
I'd say -O2 -march=native -mtune=native is good enough, you get (some) AVX without the O3 weirdness.
pedrocr•2mo ago
That's great if you're compiling for use on the same machine or those exactly like it. If you're compiling binaries for wider distribution it will generate code that some machines can't run and won't take advantage of features in others.

To be able to support multiple arch levels in the same binary I think you still need to do manual work of annotating specific functions where several versions should be generated and dispatched at runtime.
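One way to get that per-function runtime dispatch on GCC/Clang is the target_clones attribute; a hedged sketch (the function and the platform guard are illustrative, not from the thread):

```c
#include <stddef.h>

/* On x86-64 GCC/Clang with glibc, target_clones emits one copy of the
 * function per listed target plus an ifunc resolver that picks the
 * best clone at load time, so one binary ships both baseline and
 * AVX2 code paths. */
#if defined(__x86_64__) && defined(__gnu_linux__)
__attribute__((target_clones("default", "avx2")))
#endif
long dot(const int *a, const int *b, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += (long)a[i] * b[i];  /* auto-vectorized in the AVX2 clone */
    return s;
}
```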

vlovich123•2mo ago
You should at a minimum add flags to enable dead object collection (-fdata-sections and -ffunction-sections for compilation and -Wl,--gc-sections for the linker).
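The full compile-and-link invocation for those flags, as a sketch (file names hypothetical):

```shell
# Give each function and object its own section, then let the linker
# discard the sections nothing references.
cc -O2 -ffunction-sections -fdata-sections -c app.c
cc -Wl,--gc-sections app.o -o app
```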
johnthescott•2mo ago
40 years later I still have nightmares of long sessions debugging Lattice C.
NooneAtAll3•2mo ago
I don't understand

where is the problem to be solved?

eapriv•2mo ago
The problem is “to add two numbers”. The meta-problem is “to learn how computers work”.
azundo•2mo ago
I think they're expecting a daily problem set like Advent of Code. This is not a set of problems to solve, it's a series with one release per day in December, similar to an Advent calendar.
ketanmaheshwari•2mo ago
I am personally interested in the code amalgamation technique that SQLite uses[0]. It seems like a free 5-10% performance improvement, as claimed by the SQLite folks. It'd be nice if he addressed it in one of the sessions.

[0] https://sqlite.org/amalgamation.html
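A crude sketch of the idea (SQLite's amalgamation script is far more careful, but the compiler-side effect is similar):

```shell
# Concatenate the translation units so the compiler sees everything
# at once and can inline across all of them.
cat src/*.c > build/unity.c
cc -O2 build/unity.c -o app
```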

theresistor•2mo ago
This is a pretty standard topic, and not really a compiler optimization. It's usually called a unity build.

[0] https://en.wikipedia.org/wiki/Unity_build

nickelpro•2mo ago
Unity builds have been largely supplanted by LTO. They still have uses for build time improvements in one-off builds, as LTO on a non-incremental build is usually slower than the equivalent unity build.
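For comparison, a minimal LTO invocation (file names hypothetical):

```shell
# Each TU is compiled to IR, and cross-TU optimization happens at
# link time instead of via a concatenated unity file.
cc -O2 -flto -c a.c b.c
cc -O2 -flto a.o b.o -o app
```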
Sponge5•2mo ago
At my company, we have not seen any performance benefits from LTO on a GCC cross-compiled Qt application.

GCC version: 11.3 target: Cortex-A9 Qt version: 5.15

I think we tested single core and quad core, also possibly a newer GCC version, but I'm not sure. Just wanted to add my two cents.

o11c•2mo ago
I would expect a little benefit from devirt (but maybe in-TU optimizations are getting that already?), but if a program is pessimized enough, LTO's improvements won't be measurable.

And programs full of pointer-chasing are quite pessimized; highly-OO code is a common example, which includes almost all GUIs, even in C++.

gpderetta•2mo ago
Do you link against a version of the Qt library that provides IR objects?

In any case, even with whole-program optimization, I would expect effectively devirtualizing a heavily object-oriented application to be very hard.

euroderf•2mo ago
For those of you playing at home, LTO is link-time optimization.
calibas•2mo ago
Advent of Computer Science Advent Calendars, Day 2
drob518•2mo ago
Seems we’ve reached that point.
bkallus•2mo ago
I hope he ends up covering integer division by constants. The chapter on this in Hacker's Delight is really good but a little dense for casual readers.
atgreen•2mo ago
I'm looking forward to the remaining posts. The first thing I did this AM was teach SBCL how to optimize `(+ base (* index scale))` and `(+ base (ash index n))` patterns into single LEA instructions based on the day 2 learnings.
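The patterns in question, sketched in C (illustrative code, not from the post; on x86-64, base + index*scale maps onto a single LEA when the scale is 1, 2, 4, or 8):

```c
#include <stdint.h>

/* &base[index] computes base + index*4 for 32-bit ints: one LEA,
 * e.g. `lea rax, [rdi + rsi*4]`. */
int *element(int *base, int64_t index) {
    return &base[index];
}

/* base + (index << 3) is base + index*8: also a single LEA. */
uintptr_t combine(uintptr_t base, uintptr_t index) {
    return base + (index << 3);
}
```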
adamgordonbell•2mo ago
Matt is amazing. After checking out his compiler optimizations, maybe check out the recent interview I did with him.

    What I’ve come to believe is this: you should work at a level of abstraction you’re comfortable with, but you should also understand the layer beneath it.

    If you’re a C programmer, you should have some idea of how the C runtime works, and how it interacts with the operating system. You don’t need every detail, but you need enough to know what’s going on when something breaks. Because one day printf won’t work, and if the layer below is a total mystery, you won’t even know where to start looking.

    So: know one layer well, have working knowledge of the layer under it, and, most importantly, be aware of the shape of the layer below that.
https://corecursive.com/godbolt-rule-matt-godbolt/

Also this article in acmqueue by Matt is not new at all, but super great introduction to these types of optimizations.

https://queue.acm.org/detail.cfm?id=3372264

Insanity•2mo ago
The “understand one layer below where you work” idea is something my professors at uni told us 10+ years ago. Not sure where it originated, but I really think it benefited me in my career. E.g., understanding the JVM when dealing with Java helped me optimize code in a relatively heavyweight medical software package.

And also, it’s just fun to understand the lower layers.

kaladin-jasnah•2mo ago
https://cacm.acm.org/research/always-measure-one-level-deepe... This has been a classic repeat in my grad classes.
mattgodbolt•2mo ago
Awww thanks again Adam :blush:
rramadass•2mo ago
My standard question to all Experts ;-)

What are some articles/books/videos that you would recommend to go from beginner-to-expert in your domain ?

cui•2mo ago
I actually wrote a blog post about this, looking at lower-level abstractions for web developers: https://yncui.com/post/lower_level_abstractions_for_web_deve...