Neat! The idea is that _all_ output embeddings must lie on a given ellipse. (They start on a hypersphere due to layernorm, which the final linear layer distorts into an ellipse.)
Since the ellipse is determined by the model's parameters, it is characteristic of the model. And you can pretty easily verify whether a given embedding (probably) came from that model simply by checking whether it lies on that ellipse.
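Roughly like this, as a toy numpy sketch: the random `W` and `g` below are made-up stand-ins for the model's real unembedding matrix and final layernorm gain, which a verifier with the weights would load instead.

```python
import numpy as np

# Hypothetical stand-ins for a real model's final layers: W is the
# unembedding matrix (vocab x d_model), g is the final norm gain.
rng = np.random.default_rng(0)
d_model, vocab = 64, 1000
W = rng.normal(size=(vocab, d_model))
g = rng.normal(size=d_model)

def model_output_embedding(h):
    """Toy final step: RMS-normalize the hidden state, then apply the linear map.

    The normalized state lies on a hypersphere of radius sqrt(d_model);
    the linear map W @ diag(g) distorts that sphere into an ellipse in output space.
    """
    x = h / np.sqrt(np.mean(h ** 2))   # ||x|| == sqrt(d_model)
    return W @ (g * x)

def on_model_ellipse(v, tol=1e-6):
    """Check whether an output vector v lies on the model's ellipse.

    Recover the pre-image under the known linear map by least squares, then
    test (1) that v is actually in the map's column space and (2) that the
    pre-image has the norm the layernorm step would have produced.
    """
    A = W * g                                      # combined map, shape (vocab, d_model)
    x, *_ = np.linalg.lstsq(A, v, rcond=None)
    in_span = np.allclose(A @ x, v, atol=tol)
    on_sphere = abs(np.linalg.norm(x) - np.sqrt(d_model)) < tol
    return in_span and on_sphere

genuine = model_output_embedding(rng.normal(size=d_model))
forged = rng.normal(size=vocab)
print(on_model_ellipse(genuine))  # True
print(on_model_ellipse(forged))   # False (almost surely)
```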
Recovering the ellipse without access to the model weights takes a large number of embeddings, so it's not terribly practical.
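Here's a rough sense of why: you first have to pin down the d-dimensional subspace the embeddings live in, then fit a symmetric d x d quadratic form, which takes on the order of d(d+1)/2 samples (roughly 8 million embeddings for d around 4096). Toy sketch again, with a made-up map `A` standing in for the model's final layers and tiny dimensions so it runs instantly:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, vocab = 8, 200              # toy sizes; real d_model is in the thousands
r = np.sqrt(d_model)                 # radius the layernorm step would produce
A = rng.normal(size=(vocab, d_model))  # hypothetical combined final-layer map (unknown to the attacker)

def sample_embedding():
    x = rng.normal(size=d_model)
    x *= r / np.linalg.norm(x)       # point on the hypersphere
    return A @ x                     # point on the model's ellipse

# Fitting the quadric needs ~d*(d+1)/2 embeddings -- this is the "large
# number of embeddings" cost for a real model.
n = d_model * (d_model + 1) // 2 + d_model
V = np.stack([sample_embedding() for _ in range(n)])

# Step 1: the embeddings span only a d_model-dimensional subspace of R^vocab.
_, _, Wt = np.linalg.svd(V, full_matrices=False)
B = Wt[:d_model].T                   # orthonormal basis of that subspace, (vocab, d_model)
Y = V @ B                            # embedding coordinates in the subspace

# Step 2: fit the symmetric Q with y^T Q y = 1 for every genuine embedding.
idx = np.triu_indices(d_model)
weights = np.where(idx[0] == idx[1], 1.0, 2.0)
design = np.array([np.outer(y, y)[idx] * weights for y in Y])
q, *_ = np.linalg.lstsq(design, np.ones(n), rcond=None)
Q = np.zeros((d_model, d_model))
Q[idx] = q
Q = Q + Q.T - np.diag(np.diag(Q))

# A fresh genuine embedding satisfies the recovered quadric; a random vector
# generically does not (and isn't even in the subspace).
y_new = B.T @ sample_embedding()
y_fake = B.T @ rng.normal(size=vocab)
print(y_new @ Q @ y_new)    # ~1.0
print(y_fake @ Q @ y_fake)  # generically far from 1
```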
This easy-to-verify, hard-to-forge property could naturally lend itself to fingerprinting. Worth noting they call out that it's not cryptographic-grade.
nighthawk454•19h ago
mattfinlayson•8h ago