jandrewrogers•1h ago
As someone who uses pointer tagging, I must point out that this article is insufficiently defensive.
I've done my own exploration of what I can get away with across 64-bit x86 and ARM in this regard. It has been a while, but the maximum number of bits I have been able to determine to be reliably taggable across all environments and use cases is six. Can you get away with more? Probably yes, but there are identifiable environments where it will explode if you do so. Those environments may not matter for your use case.
Reliable pointer tagging is not trivial.
forrestthewoods•1h ago
Can you share details? What modern platforms/environments does this not work on? Are you saying the intersection of available bits on all platforms is just 6? Or are there platforms that actually use 58 bits?
Would be great to hear some actionable details.
nervoir•36m ago
If you include ARM, then PAC and MTE will consume a few of those precious bits. I don’t think any platform uses PAC for pointers to allocated objects, though, unless they’re deemed exceptionally important, like the creds structure pointers in task structures in the kernel.
jandrewrogers•1m ago
It wasn’t anything clever. A couple years ago I did a dive into x86 and ARM literature to determine what bits of a pointer were in use in various environments or were on a roadmap to be used in the future. To be honest, it was more bits than I was expecting.
Note also that this is the intersection of bits that are available on both ARM and x86. If you want it to be portable, you need both architectures. Just because ARM64 doesn’t use a bit doesn’t mean that x86 doesn’t and vice versa.
Both x86 and ARM have proposed standards for pointer tagging in the high bits. However, those bits don’t perfectly overlap. Also, some platforms don’t fully conform to this reservation of high bits for pointer tagging, so there is a backward compatibility issue.
Across all of that, I found six high bits that were guaranteed to be safe for all current and future platforms. In practice you can probably use more but there is a portability risk.
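To make the six-bit budget concrete, here is a minimal sketch of packing and unpacking a tag. It assumes the six bits are the topmost ones (bits 58 to 63), which is only a guess for illustration since the exact bits are not listed above, and it assumes user-space pointers whose top bits are zero. The names `tag_pointer`, `tag_of`, and `pointer_of` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: the six usable bits are the topmost ones (58..63). The thread
// does not say which six bits are safe, so this layout is illustrative only.
constexpr unsigned kTagBits = 6;
constexpr unsigned kTagShift = 64 - kTagBits;  // 58
constexpr std::uint64_t kTagMask = ~std::uint64_t{0} << kTagShift;

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");

// Pack a tag (0..63) into the top bits of a pointer.
// ASSUMPTION: a user-space pointer whose top six bits are already zero.
inline std::uint64_t tag_pointer(const void* p, unsigned tag) {
    const auto raw = static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p));
    assert((raw & kTagMask) == 0 && "pointer already uses the tag bits");
    assert(tag < (1u << kTagBits) && "tag does not fit in six bits");
    return raw | (static_cast<std::uint64_t>(tag) << kTagShift);
}

// Read the tag back without touching memory.
inline unsigned tag_of(std::uint64_t tagged) {
    return static_cast<unsigned>(tagged >> kTagShift);
}

// Strip the tag before dereferencing; the hardware is not assumed to ignore it.
template <typename T>
inline T* pointer_of(std::uint64_t tagged) {
    return reinterpret_cast<T*>(static_cast<std::uintptr_t>(tagged & ~kTagMask));
}
```

Clearing the tag in software on every access is the defensive choice here: it does not depend on any hardware feature that ignores high bits, at the cost of one extra mask per dereference. Kernel-half addresses, where the high bits are all ones, would need sign extension instead of a plain mask.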
sema4hacker•1h ago
In the '70s, when memory was always small and expensive, I had to keep things as packed and tight as possible. But now memory is so huge and cheap that it's been a long time since I had to worry about things like packing bits, which is incredibly bug-prone anyway.
dh2022•1h ago
A benefit of packing pointers shows up when the data you need is already packed into the pointer itself: that avoids a pointer dereference.
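One reading of this point, sketched below: if a small value fits in the word itself, reading it needs no memory access at all. The layout is hypothetical (low bit 1 means an inline 63-bit integer, low bit 0 means a pointer to an aligned `std::int64_t`), and `Value`, `make_inline`, `make_boxed`, and `get` are made-up names.

```cpp
#include <cstdint>

// HYPOTHETICAL layout: low bit 1 = inline 63-bit integer, low bit 0 = pointer
// to a std::int64_t (8-byte aligned, so its low bit is always 0).
struct Value {
    std::uint64_t bits;
};

// Store a small integer directly in the word. The value must fit in 63 bits.
inline Value make_inline(std::int64_t v) {
    return Value{(static_cast<std::uint64_t>(v) << 1) | 1u};
}

// Store a pointer to a value that lives elsewhere.
inline Value make_boxed(const std::int64_t* p) {
    return Value{static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(p))};
}

inline bool is_inline(Value v) { return (v.bits & 1u) != 0; }

inline std::int64_t get(Value v) {
    if (is_inline(v)) {
        // No dereference: the data is already in the word. Relies on
        // two's-complement conversion and arithmetic right shift, which is
        // guaranteed from C++20 and is what mainstream compilers do earlier.
        return static_cast<std::int64_t>(v.bits) >> 1;
    }
    // Boxed case: one pointer dereference.
    return *reinterpret_cast<const std::int64_t*>(static_cast<std::uintptr_t>(v.bits));
}
```

The inline path costs a test and a shift; the boxed path costs a load that may miss the cache, which is where the saving comes from.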
gblargg•1h ago
What's old is new again. The original 68000 processor only had a 24-bit physical address bus, so the MacOS used the upper 8 bits for tag information, and didn't even need to clear them when accessing memory. Once later CPUs and larger amounts of RAM needed those upper bits, programs had to be reworked into "32-bit clean" versions.
I wonder whether you could use the MMU to ignore these upper bits, by mapping each combination of bits to the same address as with them clear.
mlhpdx•23m ago
Brings back memories. I cut my teeth in C++ working on a system that used pointer tagging and transactional memory. Good times. That mind-bending experience perhaps made my career.
fooker•21m ago
This sort of stuff is starting to have hardware support now, so you no longer have to perform this wizardry.