Hashes should only be a reproducible label that cannot be used to produce the material described by the hash. When used for their intended purposes hashes serve as the strongest point of integrity until value collisions are discovered.
And the reason behind the problem outlined in the paper isn't a biased randomness problem but the fact that you can represent the hash function compared to a RO.
Some are designed that changing a bit has a massive influence on the resulting hash, others do the opposite.
However, if the negligible chance of a collision is what you are worried about, those also happen with random numbers.
Hash algorithms are none of that. They are not pseudo-randomness merely because a software developer merely wishes them to be so. Hash algorithms are intentionally designed to achieve high reproducibility in that a given set of input should always result in the same hash sequence as output. That intended reproducibility is by definition not random.
Most modern CPUs now contain a true RNG. They usually use some combination of a metastable latch, or thermal randomness through some kind of analog amplification. Bit strings from this are passed into a pseudorandom number generator to amplify the amount of random data generated.
There probably attacks on this too but it's much harder.
There is an untested assumption that hashes achieve randomness because they appear to be a random collection of characters. Hash sequences are completely reproducible given a set of input, and that is by definition not random.
I think you are confusing loss of prediction as randomness. Never in mathematics or logic is that line of thinking correct. This can be described by equivocation, fallacy of composition, inductive fallacy, and more.
The first sentence of the wikipedia entry on pseudo-randomness is:
"A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process."
This is why security analysis requires a higher threshold than software employment at large.
How would this apply to hash algorithms, but not CSPRNGs?
A good hash function will preserve entropy, up to the length of its output. If the input X has K bits of entropy and H is an N-bit cryptographic hash function, then the entropy of H(X) is min(K, N). In simpler terms, GIGO.
However, a hash function also scrambles its input, which means the output is indistinguishable from (uniform) random noise. This is the randomizing property I was talking about. It is good enough for hash functions to be used to build stronger primitives, like HMACs, PBKDFs, CSPRNGs, etc. There are many formalizations of this property, but one of the simplest is that given any K bits of the output, you cannot predict the other N-K output bits any better than guessing, even knowing the algorithm.
Of course, if you know the input to a hash function, you can predict the output perfectly. But if you don't know the input, the hash looks like random noise, and for cryptographic hash functions, this is a very strong and fundamental guarantee indeed.
lol, no. Cryptographic hash functions are specifically designed to achieve this property.
> Never in mathematics or logic
Let's not get ahead of ourselves. Start with English - what does "pseudo" mean?
> This can be described by equivocation, fallacy of composition, inductive fallacy, and more.
For example, what is a pseudo-intellectual?
That completely ignores the definition of the word random.
What I find most interesting about this thread of comments is that the article explains the failure of using hashes as a means of randomness and despite that failure people are eager to ignore what hashes are otherwise used for to invent oppositional arguments. it's weird.
EDIT2: I'm doing a bad job of explaining this... you obviously need the keypair associated with the cert to initiate connections with it and not trigger MITM alerts. But if you break the hash function, you don't need the private key from the root cert, the verbatim signature from the original cert will appear to be valid when spliced into your forged cert if the hash digest computation on the forged cert is the same.
Classical DSA and ECDSA do not use hash functions this way, but in my opinion they aren't stronger for it: they're basically assuming instead that some other mathematical function "looks random", which seems riskier than assuming that about a hash function. I've heard that the reason for this is to get around Schnorr's patent on doing it with hash functions, which has since expired.
The SHA3 and SHAKE hash functions (underlying e.g. ML-DSA) are explicitly designed to "look random" as well.
There are some signature schemes that try not to make such strong assumptions: in particular SLH-DSA targets properties more like first- and second-preimage resistance, target-collision-resistance, and so on.
The ‘hash’ function is a deterministic transform, not a source of randomness.
Cryptographers know that hashes (even cryptographically strong ones!) are deterministic. Yet, it is possible that in going from an interactive proof to a non-interactive one, one does not actually need randomness. Indeed, for some class of protocols, we know how to design hash functions satisfying a particular property (correlation intractability) so that the resulting non-interactive proof is sound. It's just that (a) these hashes are inefficient, and (b) until now no one had found a non-contrived protocol where using standard hashes leads to an attack.
And to your criticism that this is just programmers who don’t know what they’re doing, these algorithms were developed by Bruce Schneier, Niels Ferguson, and John Kelsey.
That said, this is honestly just a bad article that is needlessly sensationalized and fails to impart any real knowledge.
There's a joke to be made here, since the issue is with zero-knowledge proof systems.
I honestly think that Quanta Magazine just found the perfect formula to farm HN traffic. The titles are clearly carefully engineered for this audience: not the outright clickbait of university press releases, but vague profoundness that lets us upvote without reading the whole thing and then chime in with tangential anecdotes.
I don't think they're bad people, but I honestly think they end up on the front page multiple times a week not on the quality of the articles alone.
Maybe all these elaborate analogies of Alice walking into a cave and Bob yelling which exit she should come out of, Alice wanting to sell Bob a Hamiltonian cycle trustlessly, Alice and Bob mixing buckets of paint and shipping them via the mail back and forth etc. are working for some people, but it's not me.
In those circumstances those millions of coins flying in or out are not a tragedy (at least for me) but a very plausible outcome.
Implementation mistakes leading to mass coin theft would certainly be cryptocurrency news, but they would not be crypto(graphy) news. Breaking an actual peer-reviewed zero knowledge proof scheme would be.
Eureka! I found the reason that so many things in society have gone to shit in the last few years. Far too many professors are so overworked or maybe just lazy and are using this type of tool to grade student work and the end result is that we have too many students passing through the system who have demonstrably only been able to score a 10/100.
I'm over 60 now and if I had scored lower than my current age back in the day I would fail and need to repeat the grade/course. Now they just kick the can('ts) on down the road and hope no one ever notices.
Too bad some of these failures end up in positions of influence where their uncharted deficiencies have the power to disrupt or destroy functional systems.
Or maybe I'm joking. I'll know once the caffeine hits.
the analogy is not great, but in cryptography something similar is at play (easy to get/check trivial properties and then hard to achieve/produce/fake the interesting ones)
Your Eureka moment seems to be misinformed - I hope you can have it returned for another occasion.
There's a difference between something with a probability of being true and another thing that is proven to be true. There are no doubts remaining after the proof whereas the probability always leaves wiggle room even if that wiggle room is a pretty tight space.
You are right - it is possible that is happens but not probable.
However, overly focusing on this really deprives you of a lot of great intellectual stimuli from randomized algorithms and, like here, a large chunk of cryptography.
I agree with your first points about using probable and possible carefully. I originally posted a bit of a tongue in cheek, carefully selected example, from the full text of the article since that example fit the point that I hoped to make in jest. It was my carefully selected random example.
I think the focus of the article is on demonstrating that a tool used in cryptography to verify truth was widely assumed to be infallible and it turns out that is unfortunately not true since it can be manipulated to identify false results as true. This tool uses probabilities as a tool to minimize compute times that would be enormous if one had to verify absolutes so it is an important tool. Now that it is demonstrated that it can be successfully attacked, the basis of the system of verification is vulnerable and in the case of Ethereum at least, monetary losses can result.
unfortunately most people doesn't understand (and as a consequence doesn't appreciate) how little wealth we have compared to capacity, in other words how much upkeep we do to maintain that wealth
and now that it's time to scale back the waste a bit people are up in arms
I feel like I have accomplished more than I actually have and so I have plenty of incentive now to sweep through all the work of the day hoping that each randomly selected set of results yields similar levels of perfection and that all the inaccurate answers assumed to be correct do not cause the student to make assumptions in later life about things that are not supportable by facts.
I processed hundreds of thousands of miles of seismic data in my career. The first thing we needed to do for any processing project was to select a subset of the data for use in defining the parameters that would be used to process the full volume of data. We used brute stacks - a preliminary output in the process - to locate interesting areas with complex attributes to make sure we could handle the volume's complexities. In industry slang these were "carefully selected random examples".
It was inevitable that we would find a situation during the processing for which our parameters were not optimized because we had missed that edge case in selecting the examples used for testing.
In the same way in real life if you only demonstrably know that 10% of the test answers are correct then it is also true that some edge case in the 90% of untested answers could leave all or part of that untested space false or sub-optimum.
If there was a point to my original post it is this. You only know something is true when you have proven it to be true. A maximum likelihood of truth is not a guarantee of truth it is only a guarantee that it is likely to be true. You won't know how sharp the sting will be until you peel the onion.
Probabilities aren't a matter of faith, they're mathematics and as such represent a logical trueism. Your critiques are just nitpicking for the sake of it and void of substance. Have a coffee and leave this topic.
I'm a full pot in now and find that I agree with this.
The fact is though that the article demonstrates that something previously considered logically true or a maximum likelihood was proven false.
It's great to see that there are people, whether they're mathematicians or cryptographers, who will take a look at something that has been a useful part of a stable process of verification and try to find cracks or instabilities. The fact that they found an edge case that could be exploitable undermines trust in an important part of the process.
Trust - but verify, wins again. Logically, this is the best way.
When my son says, "I wasn't there" when the glass broke, it's not just a matter of differing descriptions, it's a clear denial of a fact. There are facts, and when someone deliberately twists or denies them, that's not just a different perspective. That's a lie.
The fact that is difficult to find out the truth, does not mean that something was or wasn't a lie. Paradoxically this is an attribute of a "good" lie: it is difficult to find out that it wasn't the truth.
There is a physical component to this lie but it seems to me that the social part dominates.
If you say that "it is difficult to find the truth", aside from the blatant subjectivity of your claim, then , if we believe your statement, then that itself must be a "truth". And yet you invalidate your own claim immediately by saying "different for different persons". Well, if that statement is true, then it invalidates itself as well.
You cannot disprove the existence of truth by using a system that relies on truth and falsehood.
For example, the movement of things exists, but Newton's understanding of this existence is different from Einstein's understanding of the same existence (or the movement of more things).
In this case they are talking about mathematical truth, which is a case of Coherence truth.
It would be nice if the article included timelines. Ethereum researchers have been talking about GKR since 2020,so it's hard to imagine the lack of familiarity.
It's hard to align what's being researched on Ethresar.ch and this statement.
> In the early 2000s, computer scientists showed how to do just that, contriving interactive proof protocols that were specifically designed to fail when they underwent Fiat-Shamir
Indeed, the artcile points out that targeting GKR was the idea of the Ethereum Foundation researcher.
> Soukhanov had the idea to target a Fiat-Shamir proof system based on something called the GKR protocol
Am I missing something? Or maybe the point is that, under the random oracle model, it should be hard to write a program that contains its own hash? But then again, would the trick of reading the hash from an external configuration file that isn't considered as part of the hashing be fair game?
However the paper shows that there in fact exists a pretty simple way to break the Fiat Shamir heuristic, for a protocol operating in the RO model. And such kind of efficient attacks are rather concerning in cryptography land.
So this isn't about the attack per se, rather it's about the existence of such an easy one.
Previous examples which showed how instantiating Fiat Shamir leads to an unsound protocol were so contrived that people use to think that they were a testament to how unlikely breaking FS would be [1].
In "How to Prove False Statements", you can actually build what they show.
[1]: e.g. see https://eprint.iacr.org/1998/011.pdf
They gave a (maliciously constructed) program whose outputs are pairs (a,b) where certainly a != b (instead the program is constructed such that a = b+1 always). But you can get the corresponding Fiat-Shamir protocol to accept the statement "I know a secret x such that Program(x) = (0,0)", which is clearly a false statement.
If you view random numbers as normal numbers, they will seem to be algorithmically random, or at least exceed the complexity of any proof, or even practical proof.
Basically the work of Chatlin, where given the kolmogorov complexity of your verifier, you only have a limited number of bits in any L that you can prove anything.
Probably simpler to think about the challenges of proving a fair coins is fair.
They just have to produce an unfair coin that looks fair as a flawed analogy.
Fiat-Shamir depends on interactive proofs, which equals PSPACE, which seems huge, but it can be a hay in the haystack, and if you are using a magnet to reach into the haystack you will almost never pull out a piece of hay.
They are basically choosing the magnet for you.
Cryptographic protocols often feature coin tosses. The idea is that if we replace a hash function in place of the protocol coin tosses, it should still be hard for a malicious prover to craft a false statement with an accepting proof - making the protocol unsound.
Roughly, the meat of the attack consists in baking in the statement being proven the ability for the prover to predict upfront how the hash function is going to behave - hereby breaking the Fiat Shamir heuristic and making the prover able to craft a false statement with an accepting proof.
That’s it, this is “How to Prove False Statements“!
So, if the prover can know beforehand how an hash function behaves, wouldn't this make it a more general attack on hash functions (so potentially even worse than how it is presented in the article) and the Fiat-Shamir transform is only broken as a consequence of it relying on an hash function? If not, why?
This goes back to the rather fuzzy distinction between “data” and “program” you may remember from your early CS days. More precisely, from a theoretical CS perspective, there is no solid difference between data and program.
Almost all practical ZK schemes require the user to choose some input (eg the root of the merkle tree representing the “current state” of the blockchain and secrets like keys and the amount of a transaction).
From some perspective, you get a different program for each different input; sometimes people call this “currying” or “partial evaluation”.
So yeah, it’s more serious than it seems at first blush.
That rather clearly goes wildly beyond what most ZK schemes use. That's arbitrary code execution of your choice, either as input or as part of selecting the program. Which seems like it puts this somewhere near "if you allow `eval` on user input in your script, it could do anything", doesn't it?
Plus like. They fixed it. That seems to imply it's more of an implementation flaw than a fundamental, even if it may be a surprisingly achievable one.
The problem is compounded because the hash functions are typically chosen to have extremely short polynomial representations.
There were parts of the process that check whether its a valid proof, which were previously thought to be equivalent, which were in fact not the case. Computer scientists involved knew of cases where this was not the case but moved ahead anyway because no one would design the systems in the ways the attacks would work.
Attacks only get better.
The original process included the original hash in such systems inputs, but also allowed additional malicious features which could be included to rearrange the output in a way that passes the proof scheme checks despite it being incorrect.
By placing a constraint on input entropy they believe they've mitigated the issue, but it also breaks many applications; with no good alternative.
Imo, This is a weak assertion considering finite fields are used. The title is really misleading.
It should be "Fiat Shamir is broken"; Practical attacks.
These are finite fields so the token output generated doesn't necessarily correspond to the specific path taken.
There may be an infinite many paths, and computation has classical problems of computer science with being able to automatically derive decidable paths from tokens given; potentially leading to the same issues of discovering simple breaks.
Despite that, it's nonetheless "malicious" in that, with the modifications made to it, FS can be made to "prove" false things about it. So you can "prove" that M(x)=y, even though when you actually run M(x), you find that you don't get y.
Simply put, a reliable random oracle in an adversarial environment should be based on sources of randomness from multiple sources and participants, usually the sources are the participants’ meaningful actions to prevent collusion.
It has been known for quite a while that if the space of inputs being hashed is small, the hashing is relatively useless for most benefits of a true one-way function (eg hashing a phone number in USA).
tempodox•12h ago
sheiyei•11h ago
The paper is half a year old, and hasn't made a splash; if this were significant news, I would expect to be able to find more coverage on it.
I did find this more nuanced take here: https://blog.cryptographyengineering.com/2025/02/04/how-to-p...
I haven't seen much of Quanta "Magazine", but I feel all of it has been stuff like this?
yorwba•10h ago
verandaguy•10h ago
They had an article just the other day about a more optimal sphere packing that was up my alley as a technical (programmer) person with a casual interest in broader pure math.
They do sensationalize a bit as a side effect of their process though, no argument there.
pas•10h ago
intalentive•7h ago
karel-3d•11h ago
edit: it seems to be related to something called "GKR protocol" that some cryptocurrencies use (?) - can use (?) - for somehow proving ... something? mining?.. using zero-knowledge proofs.... like here - https://www.polyhedra.network/expander (as usual in cryptocurrency, hard to tell what is actually being done/sold)
what I take from this, as a laic, is that... experimental ZK-proofs are indeed experimental.
lxgr•8h ago
baobun•2h ago
There are usually "bridge contracts" deployed on Mainnet to allow briding assets/tokens between them. This (besides obv exchanges) is where most of the ridiculous hacks and online theft of past few years have happened. The Axie/Ronin hack was a huge facepalm and should have been a lesson to be more wary of handwavy security claims of these more experimental networks.
bluGill•9h ago
I'm not sure if they can trace the fraud to you.
fract0l•9h ago
mckirk•9h ago
bluGill•8h ago
__MatrixMan__•8h ago
We'll need to find our way out of that logic eventually. Scarcity in general and proof of work in particular are terrible bases for an economy. But it is a respectable foe.
bluGill•5h ago
__MatrixMan__•4h ago
lxgr•8h ago
That would be somewhat ironic, given the "code is law" mentality of many blockchain proponents.
I don't doubt that many people would file police reports and lawsuits if any fundamental paradigm of blockchain cryptography were to suddenly be revealed as insecure, but I'd be following the lawsuits with a big bowl of popcorn.
cypherpunks01•6h ago
I'd think that if NK was sitting on a $1-10 billion Bitcoin bug, they'd execute it too before it got fixed or exploited by someone else.
bobbiechen•7h ago
There exists the concept of a zero-knowledge proof: check out the Wikipedia page for some intuitive examples of how these work in an interactive context. Basically, by asking someone who wants to prove something (the prover) a bunch of questions (challenges), you can get probabilistic confidence that they actually know that thing: https://en.wikipedia.org/wiki/Zero-knowledge_proof#Abstract_...
You want it to be interactive because that makes it much harder for the prover to "fake it" on the spot. But it would be more convenient if you didn't need to be online and actively talking to each other - so we want a non-interactive way to do the same thing.
The Fiat-Shamir transform (or heuristic) says that we can transform interactive protocols into non-interactive ones by relying on "random" challenges. If the prover can't control the randomness, then it's about as good as you interactively challenging them (and you can e.g. make them do more challenges to make up for it).
How do we get randomness? In computing we don't really have anything totally random, but cryptographic hash functions are believed to be very difficult to predict the output to. So, in cryptography there's the "random oracle model" where you say, "Well, I don't know if this protocol is safe with these real-life hashes. But if the hash function was a truly random oracle, I can prove it's safe." (The Fiat-Shamir transform is only provably secure if you believe in the random oracle model).
In the past, researchers have constructed new protocols that are safe in the random oracle model, but once you use a real hash function they're breakable because of real-world implementation details. As the abstract of this paper says, "So far, all of these examples have been contrived protocols that were specifically designed to fail." See https://crypto.stackexchange.com/q/879 for some discussion of the mechanics of how it might happen, once you choose a real hash function.
This new paper advances the field by showing an attack that targets a real-world protocol that people actually use, GKR. It shows (and again, take my interpretation with a grain of salt) that when you pick a real hash function, the attacker can construct an input (circuit) that results in whatever output the attacker wants.
---
What's the real-world impact?
There do exist real non-interactive zero-knowledge proof systems, mainly used in blockchains. Instead of publicly exposing all the info to the world and doing computation on the (slow) blockchain, you can protect privacy of transactions and/or bundle a bunch of updates into a cheaper one (ZK-rollups). Theoretically these could be attacked using the methods described in the paper.
It's unclear to me whether those are affected here (though my guess is no, since they could have mentioned it if so).
cypherpunks01•6h ago