I just finished coding the core version of a library called Cuttlefish, written entirely in Rust. It’s a CRDT-inspired framework that uses io_uring, SIMD, zero-copy pipelines, etc. Here’s what it is:
Most distributed systems go for strong consistency, and the tradeoff is latency. Cuttlefish is a coordination-free state kernel that preserves invariants and constraints at the speed of your L1 cache.
Correctness here is defined by a property of algebra. So if your operations commute, you don’t need coordination. If they don’t, you know at admission time in nanoseconds, or at least it’s supposed to.
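To make the commutativity idea concrete, here is a minimal sketch. The `Op` enum and the `apply` and `commute_at` functions are hypothetical names made up for illustration, not Cuttlefish's actual API. It only checks whether two operations give the same result in either order from one starting state; a real admission check would need to establish this for all states.

```rust
// Hypothetical types for illustration only; not Cuttlefish's API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    Add(i64),
    Mul(i64),
}

/// Apply a single operation to the state.
fn apply(state: i64, op: Op) -> i64 {
    match op {
        Op::Add(n) => state.wrapping_add(n),
        Op::Mul(n) => state.wrapping_mul(n),
    }
}

/// True if `a` then `b` gives the same result as `b` then `a`,
/// starting from this particular `state`.
fn commute_at(state: i64, a: Op, b: Op) -> bool {
    apply(apply(state, a), b) == apply(apply(state, b), a)
}

fn main() {
    // Two additions commute: (10 + 3) + 5 == (10 + 5) + 3.
    println!("add/add commute: {}", commute_at(10, Op::Add(3), Op::Add(5)));
    // Add and multiply do not: (10 + 3) * 2 = 26, but (10 * 2) + 3 = 23.
    println!("add/mul commute: {}", commute_at(10, Op::Add(3), Op::Mul(2)));
}
```

Pairs like add/add can be applied on any replica in any order, while pairs like add/mul are the ones that would need coordination.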
Running the full benchmark suite produced the following results:
Full admission cycle: ~40 ns
Kernel admit: ~13 ns
Causal clock dominance: ~700 ps
Tiered hash verification: ~280 ns
Durable admission: ~5.2 ns
WAL hash: ~230 ns
On my CPU (Ryzen 5 7600X), I measure 40 ns for a full cycle including the causality check, but I’m not sure about my benchmark setup because most of it was written by AI. How are other people reliably measuring sub-100 ns Rust code paths? Repo: https://github.com/abokhalill/cuttlefish
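For reference, one common way to measure paths this short is to time a large batch of calls and divide, since a single `Instant::now()` pair costs about as much as the thing being measured. The sketch below assumes a stand-in workload: `dominates` is a hypothetical vector-clock comparison, not Cuttlefish code, and `ns_per_call` is a made-up helper name. `std::hint::black_box` keeps the optimizer from deleting the work.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Average nanoseconds per call of `f`, measured over `iters` calls.
fn ns_per_call<F: FnMut() -> u64>(mut f: F, iters: u64) -> f64 {
    // Warm up caches and branch predictors before timing.
    for _ in 0..iters / 10 {
        black_box(f());
    }
    let start = Instant::now();
    for _ in 0..iters {
        // black_box stops the compiler from optimizing the call away.
        black_box(f());
    }
    start.elapsed().as_nanos() as f64 / iters as f64
}

/// Hypothetical stand-in workload: does clock `a` dominate clock `b`?
fn dominates(a: &[u64; 4], b: &[u64; 4]) -> bool {
    a.iter().zip(b.iter()).all(|(x, y)| x >= y)
}

fn main() {
    let a = [5u64, 7, 9, 11];
    let b = [4u64, 7, 8, 11];
    // Repeat the whole measurement and report min and median,
    // which are less sensitive to scheduler noise than the mean.
    let mut samples: Vec<f64> = (0..11)
        .map(|_| ns_per_call(|| dominates(black_box(&a), black_box(&b)) as u64, 1_000_000))
        .collect();
    samples.sort_by(|x, y| x.partial_cmp(y).unwrap());
    println!("min {:.2} ns, median {:.2} ns", samples[0], samples[samples.len() / 2]);
}
```

Build with `--release`. Sub-nanosecond numbers usually deserve a second look, since they can mean the compiler hoisted or folded the work out of the loop despite `black_box`.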
verdverm•1h ago
It used counters on the CPU, something super basic like reading those registers into a var.
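A rough sketch of what reading a CPU counter into a variable can look like in Rust, assuming x86_64 and the `_rdtsc` intrinsic from `core::arch`. `read_cycles` is a made-up helper name, and the non-x86 fallback just uses `Instant` so the example still compiles elsewhere. On modern CPUs the TSC ticks at a fixed reference rate, so the result is in reference ticks, not actual core clock cycles.

```rust
use std::hint::black_box;

/// Read the time-stamp counter (x86_64 only).
#[cfg(target_arch = "x86_64")]
fn read_cycles() -> u64 {
    // SAFETY: RDTSC is available on every x86_64 CPU.
    unsafe { core::arch::x86_64::_rdtsc() }
}

/// Fallback for other architectures: nanoseconds since the first call.
#[cfg(not(target_arch = "x86_64"))]
fn read_cycles() -> u64 {
    use std::sync::OnceLock;
    use std::time::Instant;
    static START: OnceLock<Instant> = OnceLock::new();
    START.get_or_init(Instant::now).elapsed().as_nanos() as u64
}

fn main() {
    // Estimate the cost of the counter read itself:
    // minimum difference between back-to-back reads.
    let mut overhead = u64::MAX;
    for _ in 0..1_000 {
        let t0 = read_cycles();
        let t1 = read_cycles();
        overhead = overhead.min(t1.wrapping_sub(t0));
    }

    // Time a small stand-in workload between two counter reads.
    let mut acc = 0u64;
    let t0 = read_cycles();
    for i in 0..1_000u64 {
        acc = acc.wrapping_add(black_box(i));
    }
    let t1 = read_cycles();
    black_box(acc);

    let ticks = t1.wrapping_sub(t0).saturating_sub(overhead);
    println!("read overhead ~{} ticks, workload ~{} ticks", overhead, ticks);
}
```

The counter read is not free and can be reordered around nearby instructions, so for sub-100 ns paths it is usually safer to time many iterations between the two reads than to wrap a single call.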
---
You can probably take the above to a coding agent or LLM and get what you need back.