- "Ex-spouse: I looked you up on a dating website, and your userID indicates it was created while you were at Tom's party where you swear nothing happened."
- "You say you are in XYZ timezone, but all your imageIDs (which are unique to each image upon creation) are timestamped at what would be 3am in your timezone."
Granted, for individual messages that are near-real-time, or for transactions that need to be timestamped anyway, it's probably fine. But for user-account creation or "evergreen" asset creation, it could leak the creation time to a sufficiently curious individual (or to an organized group doing data-trawling and cross-correlation).
For analysis reasons, you want to share this dataset (e.g. for diagnostics on the machine) but first must strip it of potentially identifying information.
The UUIDv7 timestamp could be used to re-identify the data through correlation: "I know this person got an MRI on this day, there's only one record with a matching datestamp, thus I know it's their MRI."
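To make the leak concrete, here is a minimal sketch of how anyone holding a UUIDv7 can read its creation time. It relies only on the standard UUIDv7 layout (the top 48 bits are a Unix timestamp in milliseconds); the function name `uuid7_timestamp` and the example ID are made up for illustration.

```python
import uuid
from datetime import datetime, timezone


def uuid7_timestamp(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a UUIDv7 (hypothetical helper)."""
    if u.version != 7:
        raise ValueError("not a UUIDv7")
    # The timestamp is the top 48 of the 128 bits, in Unix milliseconds.
    unix_ms = u.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


# Hypothetical record ID, written by hand for illustration only.
record_id = uuid.UUID("01890a5d-ac96-774b-bcce-b302099a8057")
print(uuid7_timestamp(record_id).isoformat())
```

No key or database access is needed, which is why stripping names from a dataset while leaving UUIDv7 identifiers in place may not be enough.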
Although I finished it, I never quite published it properly for some reason, probably partly because I shelved the projects where I had been going to use it (I might unshelve one of them next year).
Well, I might as well share it, because it’s quite relevant here and interesting:
https://temp.chrismorgan.info/2025-09-17-tesid/
My notes on its construction, pros and cons are fairly detailed.
Maybe I’ll go back and publish it properly next year.
(Ah, it’s fun reading through that document a bit again. A few things I’d need to update now, like the Hashids name, or in the UUID section how UUIDv7 is no longer a draft, and of sidenote 12 I moved to India and got married and so took a phone number ending in 65536, replacing my Australian 32768. :-) )
It’s lasted for three years of use and three years of disuse, and I hope to replace it with something utterly different (stylistically and technically) by the end of this year, though it may slip to next year. The replacement will be based on handwriting.
(I’m not a fan of handwriting fonts either. They’re never truly satisfying, though some with quite a few variants for each character get past the point of feeling transparently inauthentic. But when you can write and draw what you choose, where you choose, that’s liberating.)
I wanted to use it many times in projects for non-iterable IDs but never found it again.
And maybe I misunderstand how the hashing works, but it seems if you're looking things up by the hashed uuid, you're still going to want two columns anyway.
timestamp + readability
aabbdev•3h ago
How it works: the 48-bit timestamp is XOR-masked with a keyed SipHash-2-4 stream derived from the UUID’s random field. The random bits are preserved, the version flips between 7 (inside) and 4 (outside), and the RFC variant is kept. The mapping is injective: (ts, rand) → (encTS, rand). Decode is just encTS ⊕ mask, so round-trip is exact.
Security: SipHash is a PRF, so observing façades doesn’t leak the key. Wrong key = wrong timestamp. Rotation can be done with a key-ID outside the UUID.
Performance: one SipHash over 10 bytes + a couple of 48-bit loads/stores. Nanosecond overhead, header-only C11, no deps, allocation-free.
Tests: SipHash reference vectors, round-trip encode/decode, and version/variant invariants.
Curious to hear feedback!
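A rough sketch of the masking scheme described above, for readers who want to see the shape of it. This is not the project's code: it is written in Python rather than C11, and it swaps in keyed BLAKE2b from `hashlib` as a stand-in PRF because the standard library has no SipHash-2-4. The names `encode`, `decode`, `_mask48` and `RAND_MASK` are assumptions, as is the exact choice of which bits feed the PRF.

```python
import hashlib
import uuid

TS_SHIFT = 80                  # the timestamp occupies the top 48 bits
VERSION_MASK = 0xF << 76       # 4-bit version field
LOW80 = (1 << 80) - 1          # everything below the timestamp
# Random bits only: low 80 bits minus the version nibble and 2 variant bits.
RAND_MASK = LOW80 & ~VERSION_MASK & ~(0x3 << 62)


def _mask48(u_int: int, key: bytes) -> int:
    """48-bit keyed mask derived only from the random bits (BLAKE2b stand-in)."""
    rand_bytes = (u_int & RAND_MASK).to_bytes(10, "big")
    digest = hashlib.blake2b(rand_bytes, key=key, digest_size=6).digest()
    return int.from_bytes(digest, "big")


def _swap(u: uuid.UUID, key: bytes, new_version: int) -> uuid.UUID:
    # XOR the 48-bit timestamp field with the mask; XOR is its own inverse.
    new_ts = (u.int >> TS_SHIFT) ^ _mask48(u.int, key)
    # Keep the random bits and the variant, replace only the version nibble.
    low = u.int & LOW80 & ~VERSION_MASK
    return uuid.UUID(int=(new_ts << TS_SHIFT) | (new_version << 76) | low)


def encode(v7: uuid.UUID, key: bytes) -> uuid.UUID:
    """Internal v7 -> external v4-looking ID."""
    return _swap(v7, key, 4)


def decode(facade: uuid.UUID, key: bytes) -> uuid.UUID:
    """External v4-looking ID -> internal v7."""
    return _swap(facade, key, 7)


key = b"0123456789abcdef"  # hypothetical 128-bit key
internal = uuid.UUID("01890a5d-ac96-774b-bcce-b302099a8057")
external = encode(internal, key)
print(external, decode(external, key) == internal)
```

Because the mask depends only on the key and the random bits, and those bits are unchanged by encoding, decoding recomputes the same mask and the round trip is exact.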
the_mitsuhiko•1h ago
1. You implicitly take away someone else's hypothetical benefit of leveraging UUID v7, which is disappointing for any consumer of your API.
2. By exposing UUIDs through your API in a different form from the one you store internally, you're going to make your life just a tiny bit harder, because now you have to go through this indirection of conversion, and I'm not sure if this is worth it.
whatevaa•1h ago
the_mitsuhiko•53m ago
hnav•26m ago
the_mitsuhiko•17m ago
aabbdev•1h ago
kevlened•1h ago
Usually if you see an id in your http logs you can simply search your database for that id. The v4 to v7 indirection creates a small inconvenience.
The mismatch could be resolved if this were available as a fully transparent database optimization.
thunderfork•1h ago
1. Not leaking timestamp data (security/regulations)
2. Having easily time-sortable primary keys (DB performance/etc.)
If you don't have both of these needs, the tool is an unnecessary indirection, as you've identified in (2).
However, where you do have both needs, some indirection is necessary. Whether this is the correct one is a different question.
Similarly, if you _must not_ leak timestamps for some real-world reason, (1) is an intrinsic requirement, consumers be damned.
the_mitsuhiko•52m ago
JimDabell•34m ago
JimDabell•1h ago
UUIDs are often generated client-side. Am I right in thinking that this isn’t possible with this approach? Even if you let clients give you UUIDs and you gave them back the masked versions, wouldn't you be vulnerable to a client providing two UUIDs with different ts and the same rand? So this is only designed for when you are generating the UUIDv7s yourself?
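One way to reason about the same-rand case, assuming the construction is as described in the thread (the mask depends only on the key and the random field). The sketch below uses keyed BLAKE2b as a stand-in for SipHash, and the key, `mask48` and the timestamps are all hypothetical. If that assumption holds, the two masked IDs do not collide, but the mask is reused, so a client that knows one of its own timestamps can unmask the other.

```python
import hashlib


def mask48(rand: bytes, key: bytes) -> int:
    """48-bit keyed mask from the random field (BLAKE2b stand-in for SipHash)."""
    digest = hashlib.blake2b(rand, key=key, digest_size=6).digest()
    return int.from_bytes(digest, "big")


server_key = b"server-side secret"   # hypothetical, never seen by the client
rand = bytes(10)                     # client submits the same random field twice

ts1 = 1_700_000_000_000              # two different client-chosen timestamps (ms)
ts2 = 1_700_000_360_000

enc1 = ts1 ^ mask48(rand, server_key)
enc2 = ts2 ^ mask48(rand, server_key)

# Different timestamps still give different masked values (no collision)...
assert enc1 != enc2
# ...but the mask cancels out, so the XOR of the timestamps is visible.
assert enc1 ^ enc2 == ts1 ^ ts2

# A client that knows ts1 recovers the mask for this rand, and with it ts2.
recovered_mask = ts1 ^ enc1
assert recovered_mask ^ enc2 == ts2
```

This only exposes the mask for a random field the client picked itself, not the key, so how much it matters depends on whether clients are allowed to choose rand at all.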
move-on-by•27m ago
Of course, UUIDv4 on the client side is not without risk either: you need to validate uniqueness and make sure the client isn't reusing some other ID. For UUIDv7 on the client side, you could add some sanity validation, but really I think it’s best avoided.