From Rust to reality: The hidden journey of fetch_max

https://questdb.com/blog/rust-fetch-max-compiler-journey/
104•bluestreak•4h ago

Comments

IshKebab•3h ago
Yeah this comes from ARM and AXI, which has atomic max (and min, add, set, clear and xor). I assume ARM has all the corresponding instructions. RISC-V also has all of these in Zaamo.
yshui•3h ago
That's a cool find. I wonder if LLVM also does the reverse, pattern-matching handwritten CAS loops and transforming them into native ARM64 instructions.
jerrinot•2h ago
That's a very good question. A proper compiler engineer would know, but I will do my best to find something and report back.

Edit: I could not find any pass that pattern-matches CAS loops and replaces them. The closest thing I could find is this pass: https://github.com/llvm/llvm-project/blob/06fb26c3a4ede66755... I reckon one could write a similar pass to recognize CAS idioms, but its usefulness would probably be rather limited and not worth the effort/risks.
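
For readers following the question, here is a minimal sketch of the two forms being contrasted: a handwritten CAS loop (written here with `fetch_update`) and the built-in `fetch_max`. The helper names `max_via_cas_loop` and `max_via_fetch_max` are made up for illustration, and `Relaxed` ordering is an assumption; which instructions each form compiles to depends on the target and compiler version.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Handwritten form: `fetch_update` retries a compare-exchange until it succeeds.
/// Returns the value that was stored before the update.
fn max_via_cas_loop(cell: &AtomicU64, value: u64) -> u64 {
    cell.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |current| {
        // Always return Some, so the loop only ends on a successful CAS.
        Some(current.max(value))
    })
    .expect("closure always returns Some")
}

/// Built-in form: the compiler is free to pick a native atomic max where one exists.
/// Returns the value that was stored before the update.
fn max_via_fetch_max(cell: &AtomicU64, value: u64) -> u64 {
    cell.fetch_max(value, Ordering::Relaxed)
}
```

Both functions leave the same value in `cell` and return the same previous value; only the generated code may differ.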

jerrinot•3h ago
Hi, author here. My superpower is spending unreasonable amounts of time researching things with no practical purpose. Occasionally I blog about it - as a warning to others.
Ethee•2h ago
It's these kinds of posts that I appreciate reading the most, so thank you for sharing!
owls-on-wires•2h ago
“…no practical purpose” Nonsense, I learned something about compilation today. Thank you for sharing.
trws•2h ago
I liked the article. I saw your PS noting that we added it to the working draft for C++26; we also made it part of OpenMP, as of 5.0 I think. It’s sometimes a hardware atomic, like on Arm, but what made the case was that it’s commonly implemented sub-optimally even on x86 or LL/SC architectures. Often the generic CAS loop gets used, like in your lambda example, but it lacks an early cutout: you can ignore any input value that’s on the wrong side of the op by doing a cheap atomic read first, or by leaving the loop after the first failed CAS if the value read back shows the update can’t matter. It can also benefit from slightly different memory orders than the default on architectures like ppc64. It’s a surprisingly useful op to support that way.

If this kind of thing floats your boat, you might be interested in the non-reading variants of these as well. They are mostly for things like add, max, etc., but some recent architectures actually offer alternate operations that skip the read-back. The paper calls them “atomic reduction operations” https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p31...
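
The early cutout described above could look roughly like the sketch below. It is an illustration only: the function name `fetch_max_early_exit` is hypothetical, `Relaxed` ordering is an assumption, and this is not the implementation from any standard library. Note that on the early-exit path no write happens at all, so this is only a drop-in replacement when the caller does not rely on the operation always being a read-modify-write.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// CAS-loop maximum with an early exit. Returns the last value observed
/// before our update (or the value that made the update unnecessary).
fn fetch_max_early_exit(cell: &AtomicU64, candidate: u64) -> u64 {
    // Cheap atomic read first: if the stored value already dominates,
    // the loop body never runs and no CAS is issued.
    let mut current = cell.load(Ordering::Relaxed);
    while current < candidate {
        match cell.compare_exchange_weak(
            current,
            candidate,
            Ordering::Relaxed,
            Ordering::Relaxed,
        ) {
            // Our value was written; report what it replaced.
            Ok(previous) => return previous,
            // CAS failed: re-check against the value read back. If another
            // thread already stored something >= candidate, the loop exits.
            Err(observed) => current = observed,
        }
    }
    current
}
```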

tux3•2h ago
This blog sent me down a memory-models rabbit hole again. Each time I end up feeling like I'm finally starting to get it, only for a 6-line litmus test with 4 loads and 2 stores to send me crashing back down.

It makes me feel a little better reading about the history of memory models in CPUs. If this stuff wasn't intuitive to Intel either, I'm at least in good company in being confused (https://research.swtch.com/hwmm#path_to_x86-tso)

I actually knew about fetch_max from "implementing" the corresponding instruction (risc-v amomax), but I haven't done any of the fun parts yet since my soft-CPU still only has a single core.

orlp•1h ago
Aarch64 does indeed have a proper atomic max, but even on x86-64 you can get a wait-free atomic max as long as you only need to support small integers (values below 64). In that case you can simply do a `lock or` with 1 << i to record the value i; the current maximum is then the index of the highest set bit. You can even support larger ranges by using multiple registers, e.g. four 64-bit registers for a u8 maximum.
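
A minimal sketch of that bitmask trick, assuming values in the range 0..=63 and `Relaxed` ordering. The names `record` and `current_max` are made up for illustration. On x86-64 a `fetch_or` whose result is unused is typically lowered to a `lock or`, but that depends on the compiler.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Records `value` by setting bit number `value`. A single atomic OR,
/// so there is no retry loop. Panics if `value` does not fit in 64 bits.
fn record(bits: &AtomicU64, value: u32) {
    assert!(value < 64, "this sketch only supports values 0..=63");
    bits.fetch_or(1u64 << value, Ordering::Relaxed);
}

/// The maximum recorded so far is the index of the highest set bit.
/// Returns None if nothing has been recorded yet.
fn current_max(bits: &AtomicU64) -> Option<u32> {
    let snapshot = bits.load(Ordering::Relaxed);
    if snapshot == 0 {
        None
    } else {
        Some(63 - snapshot.leading_zeros())
    }
}
```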

In most cases it's even better to just store a maximum per thread separately and loop over all threads once to compute the current maximum if you really need it.
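
The per-thread alternative might be sketched as follows. The `PerThreadMax` type and its methods are hypothetical, the thread-to-slot mapping is left to the caller, and a real version would probably pad each slot to its own cache line to avoid false sharing.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// One slot per thread; each slot is written only by its owning thread.
struct PerThreadMax {
    slots: Vec<AtomicU64>,
}

impl PerThreadMax {
    fn new(threads: usize) -> Self {
        Self {
            slots: (0..threads).map(|_| AtomicU64::new(0)).collect(),
        }
    }

    /// Called only by thread `id`. Because nobody else writes this slot,
    /// a plain load and store are enough; no CAS or locked instruction.
    fn update(&self, id: usize, value: u64) {
        let slot = &self.slots[id];
        if value > slot.load(Ordering::Relaxed) {
            slot.store(value, Ordering::Relaxed);
        }
    }

    /// Loops over all slots once to compute the current overall maximum.
    fn global_max(&self) -> u64 {
        self.slots
            .iter()
            .map(|slot| slot.load(Ordering::Relaxed))
            .max()
            .unwrap_or(0)
    }
}
```

The trade-off is that writers never contend with each other, while readers pay a cost proportional to the number of threads.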

jerrinot•1h ago
That’s a neat trick, albeit with limited applicability given the very narrow range. Thanks for sharing!
MountainTheme12•27m ago
Only slightly related, but GPUs also have such instructions (exposed as InterlockedMax in HLSL and atomicMax in GLSL and CUDA).
minedwiz•21m ago
Did he get the job?
ShroudedNight•12m ago
Was this compiled at O0? The generated code looks unnecessarily long-winded - at the very least I would expect the match jump table to get culled to only the Relaxed implementation.

Baldur's Gate 3 Steam Deck – Native Version

https://larian.com/support/faqs/steam-deck-native-version_121
61•_JamesA_•1h ago•25 comments

Find SF parking cops

https://walzr.com/sf-parking/
550•alazsengul•7h ago•328 comments

MLB approves robot umpires for 2026 as part of challenge system

https://www.espn.com/mlb/story/_/id/46357017/mlb-approves-robot-umpires-2026-part-challenge-system
32•pseudolus•1h ago•14 comments

Libghostty is coming

https://mitchellh.com/writing/libghostty-is-coming
529•kingori•11h ago•161 comments

Qwen3-VL

https://qwen.ai/blog?id=99f0335c4ad9ff6153e517418d48535ab6d8afef&from=research.latest-advancement...
149•natrys•4h ago•42 comments

Markov chains are the original language models

https://elijahpotter.dev/articles/markov_chains_are_the_original_language_models
253•chilipepperhott•4d ago•107 comments

The Top Programming Languages 2025

https://spectrum.ieee.org/top-programming-languages-2025
29•jnord•1h ago•15 comments

NYC Telecom Raid: What's Up with Those Weird SIM Banks?

https://tedium.co/2025/09/23/secret-service-raid-sim-bank-telecom-hardware/
77•coloneltcb•1h ago•27 comments

Getting AI to work in complex codebases

https://github.com/humanlayer/advanced-context-engineering-for-coding-agents/blob/main/ace-fca.md
228•dhorthy•11h ago•225 comments

A vibrator helped me debug a motorcycle brake light system

https://bikesafe.me/blogs/news/how-a-vibrator-helped-me-debug-a-motorcycle-brake-light-system
20•mygnu•3d ago•8 comments

Kitty – GPU based terminal emulator

https://sw.kovidgoyal.net/kitty/
56•andsoitis•3d ago•31 comments

Go has added Valgrind support

https://go-review.googlesource.com/c/go/+/674077
470•cirelli94•16h ago•121 comments

How to draw construction equipment for kids

https://alyssarosenberg.substack.com/p/how-to-draw-construction-equipment
85•holotrope•6h ago•38 comments

Launch HN: Strata (YC X25) – One MCP server for AI to handle thousands of tools

117•wirehack•10h ago•61 comments

Context Engineering for AI Agents: Lessons

https://manus.im/blog/Context-Engineering-for-AI-Agents-Lessons-from-Building-Manus
38•helloericsf•4h ago•3 comments

Apple A19 SoC die shot

https://chipwise.tech/our-portfolio/apple-a19-dieshot/
76•giuliomagnifico•6h ago•36 comments

Always Invite Anna

https://sharif.io/anna-alexei
623•walterbell•10h ago•65 comments

Podman Desktop celebrates 3M downloads

https://podman-desktop.io/blog/3-million
53•twelvenmonkeys•4h ago•9 comments

Periodic Table of Cognition

https://kk.org/thetechnium/the-periodic-table-of-cognition/
4•garspin•1h ago•0 comments

From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

https://verialabs.com/blog/from-mcp-to-shell/
118•stuxf•10h ago•34 comments

Mesh: I tried Htmx, then ditched it

https://ajmoon.com/posts/mesh-i-tried-htmx-then-ditched-it
172•alex-moon•13h ago•124 comments

YouTube says it'll bring back creators banned for Covid and election content

https://www.businessinsider.com/youtube-reinstate-channels-banned-over-covid-content-policies-2025-9
221•delichon•5h ago•411 comments

Show HN: Ggc – A Git CLI tool written in Go with interactive UI

https://github.com/bmf-san/ggc/releases/tag/v6.0.0
20•bmf-san•3d ago•0 comments

Is life a form of computation?

https://thereader.mitpress.mit.edu/is-life-a-form-of-computation/
66•redeemed•4h ago•65 comments

consumed.today

https://consumed.today/
154•burkaman•6h ago•29 comments

Denmark wants to push through Chat Control

https://netzpolitik.org/2025/internes-protokoll-daenemark-will-chatkontrolle-durchdruecken/
210•Improvement•6h ago•105 comments

Shopify, pulling strings at Ruby Central, forces Bundler and RubyGems takeover

https://joel.drapper.me/p/rubygems-takeover/
439•bradgessler•10h ago•279 comments

Omittable – Solving the Ambiguity of Null

https://committing-crimes.com/articles/2025-09-16-null-and-absence/
5•TheWiggles•2d ago•1 comments

Sampling and structured outputs in LLMs

https://parthsareen.com/blog.html#sampling.md
201•SamLeBarbare•14h ago•85 comments