extend google.protobuf.MessageOptions {
optional uint64 domain_separator = 1234;
}
message TreeRoot {
option (domain_separator) = 4567;
...
}#1 You sign a blob and you don't touch it before verifying the signature (aka "The Cryptographic Doom Principle") #2 Signatures are bound to a context which is _not_ transmitted but used for deriving the key or mixed into the MAC or what have you. This is called the Horton principle. It ensures that signer/verifier must cryptographically agree on which context the message is intended for. You essentially cannot implement this incorrectly because if you do, all signatures will fail to verify.
The article actually proposes to violate principle #2 (by embedding some magic numbers into the protocol headers and presuming that someone will check them), which is an incorrect design and will result in bad things if history is any indication.
Principles #1 and #2 are well-established cryptographic design principles for just a handful of decades each.
It's used exactly as you say: a shared context used as input for the signature that is not transmitted.
> it makes a concatenation of the domain separator (@0x92880d38b74de9fb) and the serialization of the object, and then feeds the byte stream into the signing primitive. Similarly, verification of an object verifies this same reconstructed concatenation against the supplied signature.
> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification. Encrypt, HMAC, and hash work the same way
Though, in fairness, that is /kind of/ like transmitting it---in the sense that it impacts the message that is returned. It's more akin to sending a checksum of the magic number, rather than the magic number itself. But conceptually, that is just an optimization. The desire is for the client to ensure the server is using the same magic number, we just so happen to be able to overload the signature to encode this data without increasing the message size.
> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification.
But saying it's about wasting bytes is a little confusing, as you observe that isn't really the point.
Domain separation happens in the input to the hash function, not on the wire. Because what arrives off the wire is UNTRUSTED input.
It seems like Horton Principle just says "all messages have ≤1 meaning". If a message signed by key X must be parsed using the canonical encoding, then aren't we done?
There is still room for danger. e.g., You send `GetUserPermissionLevel(user:"Alice")` and server responds with `UserNicknameIs(user:"Alice", value:"admin")`. If you fail to check the message type, you might get tricked.
Maybe it's nice if it was mathematically impossible to validate the signature without first providing your assumptions. e.g., The subroutine to validate message `UserNicknameIs(user:"Alice", value:"admin")` requires `ServerKey × ExpectedMessageType`. But "ExpectedMessageType" isn't the only assumption being made, is it?
You might get back `UserPermissionLevel(user:"Bob", value:"admin")` or `UserPermissionLevel(user:"Alice", value:"admin", timestamp:"<3d old>")`. Will we expect the MAC to somehow accept a "user" value? And then what do we do about "timestamp"?
Maybe we implement `ClientMessage(msgUuid: UUID, requestData:...)` and `ServerResponse(clientMsgUuid: UUID, responseData:...)`, but now the UUID is a secret, vulnerable to MITM attack unless data is encrypted.
It seems like you simply must write validation code to ensure that you don't misinterpret the message that is signed. There simply isn't any magic bullet. Having multiple interpretations for a sequence of bytes is a non-starter (addressed in the post). But once you have a single interpretation for a sequence of bytes, isn't it up to the developer to define a schema + validation logic that supports their use case? Maybe there are good off-the-shelf patterns, but--again--no magic bullets?
(edit on where the OIDs can be, and added another CVE)
This is another example where you would think that "who it's for" is something the sender would sign but nope!
Crypto is hard. Do it right. Get help from your tools. 'Nuff said.
Jeeze, I'm getting too old for this crap.
It doesn't matter if I sign the word "yes", if you don't know what question is being asked. The signature needs to included the necessary context for the signature to be meaningful.
Lots of ways of doing that, and you definitely need to be thoughtful about redundant data and storage overhead, but the concept isn't tricky.
Retr0id•1h ago
Tangentially, depending on what your input and data model look like, canonicalisation takes O(nlogn) time (i.e. the cost of sorting your fields).
Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html
majormajor•1h ago
What's over my head possibly, from skimming it, about your multiset hashing is how it avoids the "these payloads have the same shape, so one could be re-sent as the other" issue? It seems like a solution to a different problem?
Retr0id•32m ago
(I realise my comment reads a bit unclearly, it's basically two separate comments, split after the first paragraph)