Great write-up that hits on a subtle but crucial aspect of cryptographic design: hashing semantics, not syntax.
It's easy to forget that serialization isn’t just a technical detail: it defines how we interpret structure. As the post shows, naïvely hashing the concatenated bytes of a list or set makes the encoding ambiguous, so two different structures can produce the same byte string and therefore the same hash, especially if separators or length fields are missing or poorly chosen.
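To make the ambiguity concrete, here is a minimal Python sketch. The function names and the 8-byte big-endian length prefix are my own choices for illustration, not something taken from the post:

```python
import hashlib
import struct

def naive_hash(items):
    # Concatenate the raw bytes with no boundaries between elements.
    return hashlib.sha256(b"".join(items)).hexdigest()

def length_prefixed_hash(items):
    # Hypothetical fix: hash the element count, then each element
    # preceded by its length as an 8-byte big-endian integer.
    h = hashlib.sha256()
    h.update(struct.pack(">Q", len(items)))
    for item in items:
        h.update(struct.pack(">Q", len(item)))
        h.update(item)
    return h.hexdigest()

a = [b"ab", b"c"]
b = [b"a", b"bc"]

# Both lists flatten to b"abc", so the naive hash cannot tell them apart.
print(naive_hash(a) == naive_hash(b))
# The length prefixes keep the element boundaries in the hashed bytes.
print(length_prefixed_hash(a) == length_prefixed_hash(b))
```

The first comparison is true and the second is false: the two lists are different structures, but only the length-prefixed encoding records where one element ends and the next begins.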
I appreciated the comparison of structured serialization (like JSON or Protobuf) vs. Merkle trees. Both are valid tools, but come with tradeoffs:
Structured formats give you clarity and alignment with meaning, but may be heavier.
Merkle trees are elegant for composability and verification, but need careful treatment of domain separation and tree shape (leaf counts that aren't a power of two, etc.).
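RFC 6962’s byte-prefix trick (0x00 prepended to leaf inputs, 0x01 to internal-node inputs, so a leaf can never be mistaken for a node) is underused wisdom; more people should know about it.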
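For anyone who hasn't seen it, a rough Python sketch of the RFC 6962 tree hash follows. The helper names are mine, and this is a sketch of the hashing rule only, not a full Certificate Transparency implementation:

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    # Leaves are hashed with a 0x00 prefix.
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # Internal nodes are hashed with a 0x01 prefix.
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves) -> bytes:
    n = len(leaves)
    if n == 0:
        # The empty tree hashes to SHA-256 of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    # Split at the largest power of two strictly less than n, which
    # fixes the tree shape when n is not a power of two.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

root = merkle_root([b"a", b"b"])
# Attempted forgery: a single "leaf" whose bytes are the two child hashes.
forged = merkle_root([leaf_hash(b"a") + leaf_hash(b"b")])
print(root != forged)
```

Without the prefixes, the two-leaf root and the forged one-leaf root would hash identical bytes. With them, the first input starts with 0x01 and the second with 0x00, so the two hashes are computed over different byte strings.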
Nice reminder that cryptographic correctness often lives in the edge cases.
Calliope1•4h ago