I have finally bitten the bullet and learned Rust in the last few months and ended up really liking it, but I have to admit that it's a bit lower level than I generally work in.
I have generally avoided locks by making very liberal use of Tokio channels, though that isn't for performance reasons or anything: I just find locks really hard to reason about for anything but extremely trivial usecases, and channels are a more natural abstraction for me.
I've never really considered what goes into these lock-free structures, but that might be one of my next "unemployment projects" after I finish my current one.
> Don't Communicate by Sharing Memory; Share Memory by Communicating
> In distributed systems there is no real shared state (imagine one machine in the USA another in Sweden) where is the shared state? In the middle of the Atlantic? — shared state breaks laws of physics. State changes are propagated at the speed of light — we always know how things were at a remote site not how they are now. What we know is what they last told us. If you make a software abstraction that ignores this fact you’ll be in trouble.
He wrote this to me in 2014, and it has really informed how I think about these things.
that's all well and good until you realize you are reimplementing a slow, buggy version of MESI in software.
Proper concurrency control is the key. Shared memory vs message passing is incidental and application specific.
Is it though? Aren't atomic load/store instructions the actual important thing. I know the type system ensures that `AtomicUsize` can only be accessed using atomic instructions but saying it's being watched by the CPU is inaccurate.
https://en.wikipedia.org/wiki/Load-link/store-conditional
It's not "24/7" but it is "watching" in some sense of the word. So not entirely unfair.
The edgy tones sound like from an LLM to me..
Other than that I take the other side. I’ve read (and subsequently never finished) dozens of programming books because they are so god awfully boring. This writing style, perhaps dialed back a little, helps keep my interest. I like the feel of a zine where it’s as technical as a professional write up but far less formal.
I often find learning through analogy useful anyway and the humor helps a lot too.
This has no unsafe code. It's all done with compare and swap. There is locking here, but it's down at the hardware level within the compare and swap instruction. This is cleaner and more portable than relying on cross-CPU ordering of operations.
[1] https://github.com/John-Nagle/rust-vulkan-bindless/blob/main...
How would this compare to the lock free abstractions that come with e.g. the java.concurrent package? It has a lot of useful primitives and data structures. I expect the memory overhead is probably worse for those.
Support for this is one of the big reason Java and the jvm has been a popular choice for companies building middleware and data processing frameworks for the last few decades. Exactly the kind of stuff that the author of this article is proposing you could build with this. Things like Kafka, Lucene, Spark, Hadoop, Flink, Beam, etc.
Indeed; normally we call it the system allocator.
A good system allocator will use per thread or per cpu free-lists so that it doesn't need to do CAS loops for every allocation though. At the very least will use hashed pools to reduce contention.
Of course if the mutex array is doing a linear scan to find the insertion point that would explain the difference but: a) I can't see the code for the alternative and b) there is no reason why the mutex variant can't use a free list.
Remember:
- Lock free doesn't automatically means faster (still it has other properties that might be desirable even if slower)
- Never trust a benchmark you didn't falsify yourself.
[1] when uncontended; when contended cache coherence cost will dominate over everything else, lock-free or not.
As the benchmark is very dependent on contention, it would give very different results if the the threads are scheduled serially as opposed to running truly concurrently (for example using a spin lock would be awful if running on a single core).
So again, you need to be very careful to understand what you are actually testing.
that benchmarking is something i should have added more alternatives to.
The article doesn't go into details but this is subtle way to mess up writing lock free data structures:
r3tr0•2d ago
I used humor and analogies in the article not just to be entertaining, but to make difficult concepts like memory ordering and atomics more approachable and memorable.
tombert•8h ago
Have you looked at CTries before? They're pretty interesting, and I think are probably the future of this space.
nmca•6h ago
bobbyraduloff•5h ago
my suggestion to OP: this was interesting material, ChatGPT made it had to read. use your own words to explain it. most people interested in this deeply technical content would rather read your prompt than the output.
r3tr0•33m ago
The core ideas, jokes, code, and analogies are 100% mine.
Human chaos. Machine polish.