Hashes should only be a reproducible label that cannot be used to produce the material described by the hash. When used for their intended purposes hashes serve as the strongest point of integrity until value collisions are discovered.
And the reason behind the problem outlined in the paper isn't a biased-randomness problem but the fact that a real hash function, unlike a random oracle (RO), can be represented as a concrete program.
Some are designed so that changing a single bit has a massive influence on the resulting hash; others do the opposite.
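To make the "changing a bit has a massive influence" case concrete, here is a minimal sketch using SHA-256 from Python's standard library. The message and the helper name `bit_difference` are made up for illustration; the only claim is that a one-bit input change typically flips roughly half of the 256 output bits.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = bytearray(b"the quick brown fox")  # arbitrary example input
flipped = bytearray(message)
flipped[0] ^= 0x01  # flip the lowest bit of the first byte

d1 = hashlib.sha256(bytes(message)).digest()
d2 = hashlib.sha256(bytes(flipped)).digest()

changed = bit_difference(d1, d2)
print(f"{changed} of 256 output bits changed")
```

A locality-sensitive or similarity hash would be the "do the opposite" case: there, nearby inputs are meant to land on nearby outputs.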
Hash algorithms are none of that. They are not pseudo-random merely because a software developer wishes them to be so. Hash algorithms are intentionally designed to achieve high reproducibility, in that a given input should always result in the same hash sequence as output. That intended reproducibility is by definition not random.
Most modern CPUs now contain a true RNG. They usually rely on a metastable latch, on thermal noise captured through some kind of analog amplification, or on a combination of the two. Bit strings from this source are passed into a pseudorandom number generator to amplify the amount of random data generated.
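A rough sketch of that "short random seed in, long stream out" step. This is not how any particular CPU does it: `os.urandom` stands in for the hardware entropy source, SHAKE-256 stands in for whatever generator the hardware actually uses, and `expand_seed` is a hypothetical helper name.

```python
import hashlib
import os

def expand_seed(seed: bytes, n_bytes: int) -> bytes:
    """Stretch a short seed into n_bytes of output with SHAKE-256 (an XOF)."""
    return hashlib.shake_256(seed).digest(n_bytes)

seed = os.urandom(32)             # stand-in for the hardware entropy source
stream = expand_seed(seed, 1024)  # 32 bytes in, 1024 bytes out

# Same seed, same stream: the expansion step itself adds no new randomness.
assert stream == expand_seed(seed, 1024)
```

All of the unpredictability comes from the seed; the expansion is fully deterministic, which is the distinction the rest of this thread keeps circling.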
There are probably attacks on this too, but it's much harder.
There is an untested assumption that hashes achieve randomness because they appear to be a random collection of characters. Hash sequences are completely reproducible given a set of input, and that is by definition not random.
I think you are confusing a loss of predictability with randomness. Never in mathematics or logic is that line of thinking correct. This can be described by equivocation, fallacy of composition, inductive fallacy, and more.
The first sentence of the Wikipedia entry on pseudo-randomness is:
"A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process."
This is why security analysis requires a higher threshold than software employment at large.
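The quoted definition can be checked in a few lines. This sketch uses Python's `random.Random` (a Mersenne Twister, which is not cryptographically secure) only to show both halves of the definition: the stream is exactly repeatable from the seed, and its bits still look statistically balanced. The seed value and the helper name `bit_stream` are arbitrary.

```python
import random

def bit_stream(seed: int, n: int) -> list[int]:
    """Produce n pseudorandom bits from a generator started at the given seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n)]

first = bit_stream(2024, 100_000)
second = bit_stream(2024, 100_000)

print(first == second)         # deterministic and repeatable
print(sum(first) / len(first)) # fraction of 1 bits, close to one half
```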
That's not what entropy means, but perhaps reflect on how this statement would apply to hash algorithms without applying to CSPRNGs.
I don't know if videos are your thing but this series helped me understand entropy a lot better.
https://youtube.com/playlist?list=PLkyBCj4JhHt_kmOgzaU09J-XP...
lol, no. Cryptographic hash functions are specifically designed to achieve this property.
> Never in mathematics or logic
Let's not get ahead of ourselves. Start with English - what does "pseudo" mean?
> This can be described by equivocation, fallacy of composition, inductive fallacy, and more.
For example, what is a pseudo-intellectual?
EDIT2: I'm doing a bad job of explaining this... you obviously need the keypair associated with the cert to initiate connections with it and not trigger MITM alerts. But if you break the hash function, you don't need the private key from the root cert, the verbatim signature from the original cert will appear to be valid when spliced into your forged cert if the hash digest computation on the forged cert is the same.
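A toy version of that splice, in case it helps. Everything here is deliberately insecure and hypothetical: a textbook RSA key with tiny primes, no padding, and an 8-bit "hash" (the first byte of SHA-256) so that a collision can be brute-forced instantly. Real certificates use full-size digests and proper padding; the sketch only shows why a signature over a digest carries over to any other message with the same digest.

```python
import hashlib

# Toy textbook-RSA key (insecure on purpose): N = 61 * 53, E * D = 1 mod 3120.
N, E, D = 3233, 17, 2753

def toy_digest(data: bytes) -> int:
    """Deliberately weak 8-bit 'hash': the first byte of SHA-256."""
    return hashlib.sha256(data).digest()[0]

def sign(data: bytes) -> int:
    """Sign the digest with the private exponent (needs the private key)."""
    return pow(toy_digest(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Check the signature against the digest using only the public key."""
    return pow(signature, E, N) == toy_digest(data)

original = b"CN=example.org"
signature = sign(original)

# The attacker never touches D: they only search for a colliding message.
target = toy_digest(original)
forged = None
for i in range(100_000):
    candidate = b"CN=attacker.example #%d" % i
    if toy_digest(candidate) == target:
        forged = candidate
        break

print(forged, verify(forged, signature))
```

The original signature is reused verbatim; only the message changed.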
Classical DSA and ECDSA do not use hash functions this way, but in my opinion they aren't stronger for it: they're basically assuming instead that some other mathematical function "looks random", which seems riskier than assuming that about a hash function. I've heard that the reason for this is to get around Schnorr's patent on doing it with hash functions, which has since expired.
The SHA3 and SHAKE hash functions (underlying e.g. ML-DSA) are explicitly designed to "look random" as well.
There are some signature schemes that try not to make such strong assumptions: in particular SLH-DSA targets properties more like first- and second-preimage resistance, target-collision-resistance, and so on.
The ‘hash’ function is a deterministic transform, not a source of randomness.
Cryptographers know that hashes (even cryptographically strong ones!) are deterministic. Yet, it is possible that in going from an interactive proof to a non-interactive one, one does not actually need randomness. Indeed, for some class of protocols, we know how to design hash functions satisfying a particular property (correlation intractability) so that the resulting non-interactive proof is sound. It's just that (a) these hashes are inefficient, and (b) until now no one had found a non-contrived protocol where using standard hashes leads to an attack.
And to your criticism that this is just programmers who don’t know what they’re doing, these algorithms were developed by Bruce Schneier, Niels Ferguson, and John Kelsey.
That said, this is honestly just a bad article that is needlessly sensationalized and fails to impart any real knowledge.
There's a joke to be made here, since the issue is with zero-knowledge proof systems.
I honestly think that Quanta Magazine just found the perfect formula to farm HN traffic. The titles are clearly carefully engineered for this audience: not the outright clickbait of university press releases, but vague profoundness that lets us upvote without reading the whole thing and then chime in with tangential anecdotes.
I don't think they're bad people, but I honestly think they end up on the front page multiple times a week not on the quality of the articles alone.
Maybe all these elaborate analogies of Alice walking into a cave and Bob yelling which exit she should come out of, Alice wanting to sell Bob a Hamiltonian cycle trustlessly, Alice and Bob mixing buckets of paint and shipping them via the mail back and forth etc. are working for some people, but it's not me.
In those circumstances those millions of coins flying in or out are not a tragedy (at least for me) but a very plausible outcome.
Implementation mistakes leading to mass coin theft would certainly be cryptocurrency news, but they would not be crypto(graphy) news. Breaking an actual peer-reviewed zero knowledge proof scheme would be.
Eureka! I found the reason that so many things in society have gone to shit in the last few years. Far too many professors are so overworked or maybe just lazy and are using this type of tool to grade student work and the end result is that we have too many students passing through the system who have demonstrably only been able to score a 10/100.
I'm over 60 now and if I had scored lower than my current age back in the day I would fail and need to repeat the grade/course. Now they just kick the can('ts) on down the road and hope no one ever notices.
Too bad some of these failures end up in positions of influence where their uncharted deficiencies have the power to disrupt or destroy functional systems.
Or maybe I'm joking. I'll know once the caffeine hits.
the analogy is not great, but in cryptography something similar is at play (easy to get/check trivial properties and then hard to achieve/produce/fake the interesting ones)
Your Eureka moment seems to be misinformed - I hope you can have it returned for another occasion.
There's a difference between something with a probability of being true and another thing that is proven to be true. There are no doubts remaining after the proof whereas the probability always leaves wiggle room even if that wiggle room is a pretty tight space.
You are right - it is possible that it happens, but not probable.
However, overly focusing on this really deprives you of a lot of great intellectual stimuli from randomized algorithms and, like here, a large chunk of cryptography.
I agree with your first points about using probable and possible carefully. I originally posted a bit of a tongue in cheek, carefully selected example, from the full text of the article since that example fit the point that I hoped to make in jest. It was my carefully selected random example.
I think the focus of the article is on demonstrating that a tool used in cryptography to verify truth was widely assumed to be infallible and it turns out that is unfortunately not true since it can be manipulated to identify false results as true. This tool uses probabilities as a tool to minimize compute times that would be enormous if one had to verify absolutes so it is an important tool. Now that it is demonstrated that it can be successfully attacked, the basis of the system of verification is vulnerable and in the case of Ethereum at least, monetary losses can result.
unfortunately most people don't understand (and as a consequence don't appreciate) how little wealth we have compared to capacity, in other words how much upkeep we do to maintain that wealth
and now that it's time to scale back the waste a bit people are up in arms
I feel like I have accomplished more than I actually have and so I have plenty of incentive now to sweep through all the work of the day hoping that each randomly selected set of results yields similar levels of perfection and that all the inaccurate answers assumed to be correct do not cause the student to make assumptions in later life about things that are not supportable by facts.
I processed hundreds of thousands of miles of seismic data in my career. The first thing we needed to do for any processing project was to select a subset of the data for use in defining the parameters that would be used to process the full volume of data. We used brute stacks - a preliminary output in the process - to locate interesting areas with complex attributes to make sure we could handle the volume's complexities. In industry slang these were "carefully selected random examples".
It was inevitable that we would find a situation during the processing for which our parameters were not optimized because we had missed that edge case in selecting the examples used for testing.
In the same way in real life if you only demonstrably know that 10% of the test answers are correct then it is also true that some edge case in the 90% of untested answers could leave all or part of that untested space false or sub-optimum.
If there was a point to my original post it is this. You only know something is true when you have proven it to be true. A maximum likelihood of truth is not a guarantee of truth it is only a guarantee that it is likely to be true. You won't know how sharp the sting will be until you peel the onion.
Probabilities aren't a matter of faith, they're mathematics and as such represent a logical truism. Your critiques are just nitpicking for the sake of it and void of substance. Have a coffee and leave this topic.
I'm a full pot in now and find that I agree with this.
The fact is though that the article demonstrates that something previously considered logically true or a maximum likelihood was proven false.
It's great to see that there are people, whether they're mathematicians or cryptographers, who will take a look at something that has been a useful part of a stable process of verification and try to find cracks or instabilities. The fact that they found an edge case that could be exploitable undermines trust in an important part of the process.
Trust - but verify, wins again. Logically, this is the best way.
When my son says, "I wasn't there" when the glass broke, it's not just a matter of differing descriptions, it's a clear denial of a fact. There are facts, and when someone deliberately twists or denies them, that's not just a different perspective. That's a lie.
The fact that it is difficult to find out the truth does not mean that something was or wasn't a lie. Paradoxically, this is an attribute of a "good" lie: it is difficult to find out that it wasn't the truth.
There is a physical component to this lie but it seems to me that the social part dominates.
If you say that "it is difficult to find the truth" then, aside from the blatant subjectivity of your claim, if we believe your statement, that itself must be a "truth". And yet you invalidate your own claim immediately by saying "different for different persons". Well, if that statement is true, then it invalidates itself as well.
You cannot disprove the existence of truth by using a system that relies on truth and falsehood.
For example, the movement of things exists, but Newton's understanding of this existence is different from Einstein's understanding of the same existence (or the movement of more things).
In this case they are talking about mathematical truth, which is a case of Coherence truth.
It would be nice if the article included timelines. Ethereum researchers have been talking about GKR since 2020, so it's hard to imagine the lack of familiarity.
It's hard to align what's being researched on ethresear.ch and this statement.
Am I missing something? Or maybe the point is that, under the random oracle model, it should be hard to write a program that contains its own hash? But then again, would the trick of reading the hash from an external configuration file that isn't considered as part of the hashing be fair game?
However the paper shows that there in fact exists a pretty simple way to break the Fiat Shamir heuristic, for a protocol operating in the RO model. And such kind of efficient attacks are rather concerning in cryptography land.
So this isn't about the attack per se, rather it's about the existence of such an easy one.
They gave a (maliciously constructed) program whose outputs are pairs (a,b) where certainly a != b (instead the program is constructed such that a = b+1 always). But you can get the corresponding Fiat-Shamir protocol to accept the statement "I know a secret x such that Program(x) = (0,0)", which is clearly a false statement.
If you view random numbers as normal numbers, they will seem to be algorithmically random, or at least exceed the complexity of any proof, or even practical proof.
Basically the work of Chaitin, where given the Kolmogorov complexity of your verifier, there is only a limited number of bits in any L about which you can prove anything.
Probably simpler to think about the challenges of proving a fair coin is fair.
As a flawed analogy, they just have to produce an unfair coin that looks fair.
Fiat-Shamir depends on interactive proofs, which equals PSPACE, which seems huge, but it can be a piece of hay in the haystack, and if you are using a magnet to reach into the haystack you will almost never pull out a piece of hay.
They are basically choosing the magnet for you.
This goes back to the rather fuzzy distinction between “data” and “program” you may remember from your early CS days. More precisely, from a theoretical CS perspective, there is no solid difference between data and program.
Almost all practical ZK schemes require the user to choose some input (e.g. the root of the Merkle tree representing the “current state” of the blockchain and secrets like keys and the amount of a transaction).
From some perspective, you get a different program for each different input; sometimes people call this “currying” or “partial evaluation”.
So yeah, it’s more serious than it seems at first blush.
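For anyone who hasn't met partial evaluation, a tiny sketch with `functools.partial`. The function `transfer_is_valid` and the value `"root-abc123"` are invented for illustration and have nothing to do with any real ZK system; the point is only that fixing one input of a program gives you a different, more specialised program.

```python
from functools import partial

def transfer_is_valid(state_root: str, amount: int, balance: int) -> bool:
    """Hypothetical check: a root is present and the amount is covered."""
    return state_root != "" and 0 < amount <= balance

# Fixing the first input yields a new program that takes only the rest.
check_for_root = partial(transfer_is_valid, "root-abc123")

print(check_for_root(amount=5, balance=10))   # True
print(check_for_root(amount=50, balance=10))  # False
```

So "the user picks the input" and "the user picks the program" are closer than they sound.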
That rather clearly goes wildly beyond what most ZK schemes use. That's arbitrary code execution of your choice, either as input or as part of selecting the program. Which seems like it puts this somewhere near "if you allow `eval` on user input in your script, it could do anything", doesn't it?
Plus like. They fixed it. That seems to imply it's more of an implementation flaw than a fundamental one, even if it may be a surprisingly achievable one.
The problem is compounded because the hash functions are typically chosen to have extremely short polynomial representations.
tempodox•7h ago
sheiyei•6h ago
The paper is half a year old, and hasn't made a splash; if this were significant news, I would expect to be able to find more coverage on it.
I did find this more nuanced take here: https://blog.cryptographyengineering.com/2025/02/04/how-to-p...
I haven't seen much of Quanta "Magazine", but I feel all of it has been stuff like this?
yorwba•5h ago
verandaguy•5h ago
They had an article just the other day about a more optimal sphere packing that was up my alley as a technical (programmer) person with a casual interest in broader pure math.
They do sensationalize a bit as a side effect of their process though, no argument there.
pas•5h ago
intalentive•2h ago
karel-3d•6h ago
edit: it seems to be related to something called "GKR protocol" that some cryptocurrencies use (?) - can use (?) - for somehow proving ... something? mining?.. using zero-knowledge proofs.... like here - https://www.polyhedra.network/expander (as usual in cryptocurrency, hard to tell what is actually being done/sold)
what I take from this, as a layperson, is that... experimental ZK-proofs are indeed experimental.
lxgr•3h ago
bluGill•4h ago
I'm not sure if they can trace the fraud to you.
fract0l•4h ago
mckirk•4h ago
bluGill•3h ago
__MatrixMan__•3h ago
We'll need to find our way out of that logic eventually. Scarcity in general and proof of work in particular are terrible bases for an economy. But it is a respectable foe.
bluGill•37m ago
lxgr•3h ago
That would be somewhat ironic, given the "code is law" mentality of many blockchain proponents.
I don't doubt that many people would file police reports and lawsuits if any fundamental paradigm of blockchain cryptography were to suddenly be revealed as insecure, but I'd be following the lawsuits with a big bowl of popcorn.
cypherpunks01•1h ago
I'd think that if NK was sitting on a $1-10 billion Bitcoin bug, they'd execute it too before it got fixed or exploited by someone else.
bobbiechen•2h ago
There exists the concept of a zero-knowledge proof: check out the Wikipedia page for some intuitive examples of how these work in an interactive context. Basically, by asking someone who wants to prove something (the prover) a bunch of questions (challenges), you can get probabilistic confidence that they actually know that thing: https://en.wikipedia.org/wiki/Zero-knowledge_proof#Abstract_...
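To put numbers on "probabilistic confidence": assuming, purely for illustration, that a prover who doesn't know the secret survives each independent challenge with probability 1/2, the chance of bluffing through every round shrinks exponentially. The function name and the 1/2 figure are assumptions, not taken from any specific protocol.

```python
def cheating_probability(rounds: int, per_round: float = 0.5) -> float:
    """Chance that a prover who can only guess survives every round."""
    return per_round ** rounds

for k in (1, 10, 40, 128):
    print(k, cheating_probability(k))
```

It never reaches zero, but after enough rounds it is smaller than any risk you would worry about in practice.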
You want it to be interactive because that makes it much harder for the prover to "fake it" on the spot. But it would be more convenient if you didn't need to be online and actively talking to each other - so we want a non-interactive way to do the same thing.
The Fiat-Shamir transform (or heuristic) says that we can transform interactive protocols into non-interactive ones by relying on "random" challenges. If the prover can't control the randomness, then it's about as good as you interactively challenging them (and you can e.g. make them do more challenges to make up for it).
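Here is a minimal sketch of the transform applied to Schnorr's identification protocol, which is the textbook example rather than the GKR protocol discussed in the paper. The group parameters are toy-sized (g = 2 has prime order 11 modulo 23) and the transcript encoding is made up; in the interactive version the verifier would pick `c` at random, and here the prover derives it by hashing.

```python
import hashlib
import secrets

# Toy group, far too small for real use: G has prime order Q modulo P.
P, Q, G = 23, 11, 2

def challenge(public_key: int, commitment: int) -> int:
    """Derive the challenge by hashing the transcript instead of asking a verifier."""
    data = f"{G}|{public_key}|{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(secret: int) -> tuple[int, int, int]:
    """Produce a non-interactive proof of knowledge of the discrete log."""
    public_key = pow(G, secret, P)
    r = secrets.randbelow(Q)               # prover's one-time nonce
    commitment = pow(G, r, P)
    c = challenge(public_key, commitment)  # replaces the verifier's random coin
    response = (r + c * secret) % Q
    return public_key, commitment, response

def verify(public_key: int, commitment: int, response: int) -> bool:
    """Recompute the challenge and check G^response == commitment * public_key^c."""
    c = challenge(public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P

public_key, commitment, response = prove(secret=7)
print(verify(public_key, commitment, response))
```

Nobody has to be online to ask the question: anyone can recompute the challenge from the transcript. The catch is that the prover now also knows exactly how the challenge is computed.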
How do we get randomness? In computing we don't really have anything totally random, but the outputs of cryptographic hash functions are believed to be very difficult to predict. So, in cryptography there's the "random oracle model" where you say, "Well, I don't know if this protocol is safe with these real-life hashes. But if the hash function was a truly random oracle, I can prove it's safe." (The Fiat-Shamir transform is only provably secure if you believe in the random oracle model).
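One way to picture the gap between the model and reality, as a sketch only. `LazyRandomOracle` is a hypothetical class that simulates the idealised oracle by inventing a fresh random answer for each new query and remembering it; `real_hash` is what actually gets deployed.

```python
import hashlib
import os

class LazyRandomOracle:
    """Idealised oracle: a fresh random answer per new query, remembered forever."""

    def __init__(self, out_len: int = 32) -> None:
        self.out_len = out_len
        self.table: dict[bytes, bytes] = {}

    def query(self, message: bytes) -> bytes:
        if message not in self.table:
            self.table[message] = os.urandom(self.out_len)
        return self.table[message]

def real_hash(message: bytes) -> bytes:
    """Concrete instantiation: a fixed public algorithm anyone can run or embed."""
    return hashlib.sha256(message).digest()

oracle = LazyRandomOracle()
print(oracle.query(b"hello") == oracle.query(b"hello"))  # consistent answers
print(real_hash(b"hello") == real_hash(b"hello"))        # also consistent
```

Both give consistent answers, but only the second has a description you can write down and feed into another program. The security proof is about the first; what ships is the second.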
In the past, researchers have constructed new protocols that are safe in the random oracle model, but once you use a real hash function they're breakable because of real-world implementation details. As the abstract of this paper says, "So far, all of these examples have been contrived protocols that were specifically designed to fail." See https://crypto.stackexchange.com/q/879 for some discussion of the mechanics of how it might happen, once you choose a real hash function.
This new paper advances the field by showing an attack that targets a real-world protocol that people actually use, GKR. It shows (and again, take my interpretation with a grain of salt) that when you pick a real hash function, the attacker can construct an input (circuit) that results in whatever output the attacker wants.
---
What's the real-world impact?
There do exist real non-interactive zero-knowledge proof systems, mainly used in blockchains. Instead of publicly exposing all the info to the world and doing computation on the (slow) blockchain, you can protect privacy of transactions and/or bundle a bunch of updates into a cheaper one (ZK-rollups). Theoretically these could be attacked using the methods described in the paper.
It's unclear to me whether those are affected here (though my guess is no, since they could have mentioned it if so).
cypherpunks01•1h ago