Signing data structures the wrong way

https://blog.foks.pub/posts/domain-separation-in-idl/
43•malgorithms•1h ago

Comments

Retr0id•1h ago
Putting domain separators in the IDL is interesting but you can also avoid the problem by putting the domain separators in-band (e.g. in some kind of "type" field that is always present).

Tangentially, depending on what your input and data model look like, canonicalisation takes O(nlogn) time (i.e. the cost of sorting your fields).

Here I describe an alternative approach that produces deterministic hashes without a distinct canonicalization step, using multiset hashing: https://www.da.vidbuchanan.co.uk/blog/signing-json.html
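
To make the canonicalisation cost mentioned above concrete, here is a minimal sketch of the conventional "sort, then hash" approach. It is not the multiset-hashing construction from the linked post, and `canonical_digest` is a hypothetical helper name. It assumes plain JSON-compatible values and ignores harder cases such as float formatting and Unicode normalisation.

```python
import hashlib
import json


def canonical_digest(obj) -> str:
    """Hash a JSON-compatible value after putting it in a canonical form."""
    # sort_keys=True sorts the keys of every object (the O(n log n) step);
    # compact separators remove whitespace differences between encoders.
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dicts built in different key orders produce the same digest.
a = {"user": "alice", "seq": 7}
b = {"seq": 7, "user": "alice"}
assert canonical_digest(a) == canonical_digest(b)
```

A multiset hash would aim for the same order-independence by combining per-field hashes with a commutative operation instead of sorting first.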

majormajor•1h ago
I think a lot of people assume that the "name" of the type, for protos, will be preserved somewhere in the output such that a TreeRoot couldn't be re-used as a KeyRevoke. It makes sense that it isn't - you generally don't want to send that name every time - but it's non-obvious to people with an object-oriented-language background who just think "ah, different types are obviously different types." Serialization cost is also the objection I've most often seen raised against in-band type fields and the like, so having a unique identifier that gets used just for signature computation is clever.

What's over my head possibly, from skimming it, about your multiset hashing is how it avoids the "these payloads have the same shape, so one could be re-sent as the other" issue? It seems like a solution to a different problem?

Retr0id•32m ago
Multiset hashing is not related to the domain separation problem, but it is related to the broader "signing data structures" problem.

(I realise my comment reads a bit unclearly, it's basically two separate comments, split after the first paragraph)

tantalor•1h ago
Since the example was given in proto, I'll suggest a solution in proto: add a message option.

  extend google.protobuf.MessageOptions {
    optional uint64 domain_separator = 1234;
  }

  message TreeRoot {
    option (domain_separator) = 4567;
    ...
  }

formerly_proven•1h ago
This article claims that these are somewhat open questions, but they're not and have not been for a long time.

#1 You sign a blob and you don't touch it before verifying the signature (aka "The Cryptographic Doom Principle").

#2 Signatures are bound to a context which is _not_ transmitted but used for deriving the key or mixed into the MAC or what have you. This is called the Horton principle. It ensures that signer/verifier must cryptographically agree on which context the message is intended for. You essentially cannot implement this incorrectly because if you do, all signatures will fail to verify.

The article actually proposes to violate principle #2 (by embedding some magic numbers into the protocol headers and presuming that someone will check them), which is an incorrect design and will result in bad things if history is any indication.

Principles #1 and #2 are well-established cryptographic design principles for just a handful of decades each.

ahtihn•1h ago
Maybe I'm misunderstanding the article but I'm fairly sure the magic number is not transmitted.

It's used exactly as you say: a shared context used as input for the signature that is not transmitted.

lokar•1h ago
No, I'm pretty sure they are saying you need to transmit it
nightpool•51m ago
No, they propose just concatenating it with the data received from the network

> it makes a concatenation of the domain separator (@0x92880d38b74de9fb) and the serialization of the object, and then feeds the byte stream into the signing primitive. Similarly, verification of an object verifies this same reconstructed concatenation against the supplied signature.

> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification. Encrypt, HMAC, and hash work the same way
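
The quoted mechanism can be sketched as follows, using HMAC as a stand-in for the signing primitive. The TreeRoot separator value is the one quoted above. The KeyRevoke value, the big-endian 8-byte encoding, and the `tag`/`verify` helper names are assumptions made for illustration.

```python
import hashlib
import hmac

# Separator quoted from the article; the byte order is an assumption.
TREE_ROOT_SEP = (0x92880D38B74DE9FB).to_bytes(8, "big")
# Hypothetical separator for a second message type.
KEY_REVOKE_SEP = (0x1122334455667788).to_bytes(8, "big")


def tag(key: bytes, separator: bytes, serialized: bytes) -> bytes:
    """MAC over separator || serialization; only `serialized` goes on the wire."""
    return hmac.new(key, separator + serialized, hashlib.sha256).digest()


def verify(key: bytes, separator: bytes, serialized: bytes, received: bytes) -> bool:
    """Recompute the tag with the separator the verifier expects."""
    return hmac.compare_digest(tag(key, separator, serialized), received)


key = b"demo key, not for real use"
wire_bytes = b"\x01\x02\x03"  # stands in for a serialized TreeRoot
t = tag(key, TREE_ROOT_SEP, wire_bytes)

assert verify(key, TREE_ROOT_SEP, wire_bytes, t)       # same context: accepted
assert not verify(key, KEY_REVOKE_SEP, wire_bytes, t)  # wrong context: rejected
```

Because the separator has a fixed length, the concatenation is unambiguous, and the verifier supplies the separator itself instead of reading it from untrusted input.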

lokar•26m ago
Oh, it's just in the hash input. So if you don't use the right ID when you check the hash, it fails.
tennysont•7m ago
You are, of course, right. And this distinction is important for this chain of comments.

Though, in fairness, that is /kind of/ like transmitting it---in the sense that it impacts the message that is returned. It's more akin to sending a checksum of the magic number, rather than the magic number itself. But conceptually, that is just an optimization. The desire is for the client to ensure the server is using the same magic number, we just so happen to be able to overload the signature to encode this data without increasing the message size.

jcalvinowens•49m ago
I think not:

> Note that the domain separator does not appear in the eventual serialization (which would waste bytes), since both signer and receiver agree on it via this shared protocol specification.

But saying it's about wasting bytes is a little confusing, as you observe that isn't really the point.

jeffrallen•25m ago
It is definitely not transmitted.

Domain separation happens in the input to the hash function, not on the wire. Because what arrives off the wire is UNTRUSTED input.

lokar•59m ago
What if (and this is perhaps too big an if) you only ever serialize and de-serialize with code generated from the IDL, which always checks the magic numbers (returning a typed object)?
jeffrallen•22m ago
It's a big if because the threat model normally includes "bad guys can forge messages". Which means that the input is untrusted and you want to generate your own domain separation bytes for the hash function, not let your attacker choose them.
Muromec•37m ago
The article proposes a way to agree on context out of band and enforce it with idl. This seems to be an implementation of the principle you mention
tennysont•17m ago
Hmmmm. I agree that an ad-hoc implementation with protobufs can go wrong. But presumably, 1 canonical encoding for the private key constitutes the Horton principle?

It seems like Horton Principle just says "all messages have ≤1 meaning". If a message signed by key X must be parsed using the canonical encoding, then aren't we done?

There is still room for danger. e.g., You send `GetUserPermissionLevel(user:"Alice")` and server responds with `UserNicknameIs(user:"Alice", value:"admin")`. If you fail to check the message type, you might get tricked.

Maybe it's nice if it was mathematically impossible to validate the signature without first providing your assumptions. e.g., The subroutine to validate message `UserNicknameIs(user:"Alice", value:"admin")` requires `ServerKey × ExpectedMessageType`. But "ExpectedMessageType" isn't the only assumption being made, is it?

You might get back `UserPermissionLevel(user:"Bob", value:"admin")` or `UserPermissionLevel(user:"Alice", value:"admin", timestamp:"<3d old>")`. Will we expect the MAC to somehow accept a "user" value? And then what do we do about "timestamp"?

Maybe we implement `ClientMessage(msgUuid: UUID, requestData:...)` and `ServerResponse(clientMsgUuid: UUID, responseData:...)`, but now the UUID is a secret, vulnerable to MITM attack unless data is encrypted.

It seems like you simply must write validation code to ensure that you don't misinterpret the message that is signed. There simply isn't any magic bullet. Having multiple interpretations for a sequence of bytes is a non-starter (addressed in the post). But once you have a single interpretation for a sequence of bytes, isn't it up to the developer to define a schema + validation logic that supports their use case? Maybe there are good off-the-shelf patterns, but--again--no magic bullets?
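
A minimal sketch of the split described above: the expected message type is a required input to signature checking, while the `user` and `timestamp` checks remain ordinary application code. The separator values, the JSON payload encoding, the 60-second freshness window, and every function name here are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical per-type separators; any unique fixed-length values would do.
SEPARATORS = {
    "UserPermissionLevel": bytes.fromhex("0000000000000001"),
    "UserNicknameIs": bytes.fromhex("0000000000000002"),
}


@dataclass
class UserPermissionLevel:
    user: str
    value: str
    timestamp: float


def sign(key: bytes, type_name: str, payload: bytes) -> bytes:
    """MAC over the type's separator followed by the payload bytes."""
    return hmac.new(key, SEPARATORS[type_name] + payload, hashlib.sha256).digest()


def open_permission(key, payload, tag, *, expected_user, now, max_age=60.0):
    """Check the tag under the expected type, then validate the fields."""
    expected = sign(key, "UserPermissionLevel", payload)
    if not hmac.compare_digest(expected, tag):
        raise ValueError("bad signature or wrong message type")
    msg = UserPermissionLevel(**json.loads(payload))
    # The MAC cannot know which user or how fresh a reply we wanted.
    if msg.user != expected_user:
        raise ValueError("response is about a different user")
    if now - msg.timestamp > max_age:
        raise ValueError("response is stale")
    return msg
```

A reply signed as `UserNicknameIs` fails at the first check, while a valid `UserPermissionLevel` reply about Bob, or one that is three days old, only fails at the later checks the developer had to write.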

Muromec•40m ago
So another lesson has been relearned from asn.1. I'm proud of working in this industry again! Next we will figure out that we should always put versions into the data too
jbmsf•39m ago
That was my first thought as well.
maxtaco•24m ago
I would say two problems with the asn.1 approach are: (1) it seems like too much cognitive overload for the OIDs to have semantic meaning, and it invites accidental reuse; I think it matters way more that the OIDs are unique, which randomness gets you without much effort; and (2) the OIDs aren't always serialized first, they are allowed to be inside the message, and there are failures that have resulted (https://nvd.nist.gov/vuln/detail/cve-2022-24771, https://nvd.nist.gov/vuln/detail/CVE-2025-12816)

(edit on where the OIDs can be, and added another CVE)

logicallee•40m ago
Along the same lines, did you know that you can get an authenticated email that the listed sender never sent to you? If the third party can get a server to send it to themselves (for example Google forms will send them an email with the contents that they want) they can then forward it to you while spoofing the from: field as Google.com in this example, and it will appear in your inbox from the "sender" (Google.com) and appear as fully authenticated - even though Google never actually sent you that.

This is another example where you would think that "who it's for" is something the sender would sign but nope!

tennysont•3m ago
I asked about this on the PGP mailing list at one point, and I think I was told that the best solution is to start emails with "Hi <recipient>," which seems like a funny low-tech solution to a (sad) problem.
sillywabbit•32m ago
They've reinvented protobuf headers.
jeffrallen•29m ago
This is a nice explanation of an obvious idea. Both domain separation, and putting the domain signifier into the IDL are fine, but not novel.

Crypto is hard. Do it right. Get help from your tools. 'Nuff said.

Jeeze, I'm getting too old for this crap.

lukev•24m ago
So, isn't this a rather longwinded way to say that a signature only extends to the scope of the message it contains?

It doesn't matter if I sign the word "yes", if you don't know what question is being asked. The signature needs to include the necessary context for the signature to be meaningful.

Lots of ways of doing that, and you definitely need to be thoughtful about redundant data and storage overhead, but the concept isn't tricky.

cogman10•7m ago
Why not digest the type as part of the hash? This avoids the problem in the article and keeps the transmission size small.
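
One way to read this suggestion, as a rough sketch: feed the type name into the hash ahead of the serialized bytes and keep it off the wire. The two-byte length prefix and the `typed_digest` name are assumptions. The prefix is there so the boundary between name and payload cannot be shifted.

```python
import hashlib


def typed_digest(type_name: str, serialized: bytes) -> bytes:
    """Digest of (length-prefixed type name || serialized message)."""
    name = type_name.encode("utf-8")
    h = hashlib.sha256()
    h.update(len(name).to_bytes(2, "big"))  # fixes where the name ends
    h.update(name)
    h.update(serialized)  # only these bytes are transmitted
    return h.digest()


body = b"\x08\x01"  # stands in for identical bytes of two same-shaped messages
assert typed_digest("TreeRoot", body) != typed_digest("KeyRevoke", body)

# Without the length prefix, ("ab", b"c") and ("a", b"bc") would feed the
# same byte stream into the hash; with it they differ.
assert typed_digest("ab", b"c") != typed_digest("a", b"bc")
```

Compared with a random fixed-width separator, a type name is readable but depends on names staying unique across every protocol that shares the key.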
