Also, having sacrificed my own mental health to watch the disgustingly self-promoting hour-long video that announces this small git commit, I can confidently say that "Graviton doesn't have any performance counters" is one of the wrongest things I've heard in a long time.
Overall, I give it an F.
Anyway, if you want to hide memory refresh latency, IBM zEnterprise is your platform. It completely hides refresh latency by steering loads to the non-refreshing bank, and it only costs half the space, not up to 92% of your space like this technique.
The clflush is there because the technique targets data that will miss the cache anyway. If your working set fits in L1, you don’t need this.
Also, AWS Graviton instances absolutely do not expose per-channel memory controller counter PMUs. That’s why you have to use timing-based channel discovery.
The IBM z-system is neat! But my technique works on commodity hardware in userspace, and you can easily sacrifice only half the space if you accept 2-way instead of 8+-way hedging. It’s entirely up to you how many channel copies you want to use.
Your reply was quite rude, but I hope this is informative.
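To put rough numbers on the space trade-off discussed above, here is a minimal sketch. It assumes each hedged value is stored once per channel copy, so only 1/N of the memory holds unique data. The name `wasted_fraction` is made up for illustration and does not come from the project.

```python
def wasted_fraction(copies: int) -> float:
    """Fraction of memory taken up by redundant copies under N-way hedging.

    Assumption: every value is stored `copies` times (once per channel),
    so only 1/copies of the allocated space holds unique data.
    """
    if copies < 1:
        raise ValueError("need at least one copy")
    return 1 - 1 / copies


# Compare 2-way hedging with wider configurations.
for n in (2, 8, 12):
    print(f"{n}-way hedging: {wasted_fraction(n):.1%} of space is redundant")
```

Under that assumption, 2-way hedging costs half the space, and the cost only approaches the 92% mentioned earlier in the thread at around twelve copies.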
Not sure how this works for larger data structures, but my first thought was that this should be implemented as some microcode or instruction.
Most computation is not that jitter-sensitive, and perception doesn't really work on the nano- to microsecond scale, but it could be a cool gadget for things like dtrace or interrupt handlers.
OT: Tail Slayer. Not Tails Layer. My brain took longer to parse that than I’d have wanted.
shaicoleman•2h ago
* Video [2]
1. https://x.com/lauriewired/status/2041566601426956391 (https://xcancel.com/lauriewired/status/2041566601426956391)
2. https://www.youtube.com/watch?v=KKbgulTp3FE