
Federated Credential Management (FedCM)

https://ciamweekly.substack.com/p/federated-credential-management-fedcm
1•mooreds•42s ago•0 comments

Token-to-Credit Conversion: Avoiding Floating-Point Errors in AI Billing Systems

https://app.writtte.com/read/kZ8Kj6R
1•lasgawe•1m ago•1 comments

The Story of Heroku (2022)

https://leerob.com/heroku
1•tosh•1m ago•0 comments

Obey the Testing Goat

https://www.obeythetestinggoat.com/
1•mkl95•1m ago•0 comments

Claude Opus 4.6 extends LLM pareto frontier

https://michaelshi.me/pareto/
1•mikeshi42•2m ago•0 comments

Brute Force Colors (2022)

https://arnaud-carre.github.io/2022-12-30-amiga-ham/
1•erickhill•5m ago•0 comments

Google Translate apparently vulnerable to prompt injection

https://www.lesswrong.com/posts/tAh2keDNEEHMXvLvz/prompt-injection-in-google-translate-reveals-ba...
1•julkali•5m ago•0 comments

(Bsky thread) "This turns the maintainer into an unwitting vibe coder"

https://bsky.app/profile/fullmoon.id/post/3meadfaulhk2s
1•todsacerdoti•6m ago•0 comments

Software development is undergoing a Renaissance in front of our eyes

https://twitter.com/gdb/status/2019566641491963946
1•tosh•6m ago•0 comments

Can you beat ensloppification? I made a quiz for Wikipedia's Signs of AI Writing

https://tryward.app/aiquiz
1•bennydog224•8m ago•1 comments

Spec-Driven Design with Kiro: Lessons from Seddle

https://medium.com/@dustin_44710/spec-driven-design-with-kiro-lessons-from-seddle-9320ef18a61f
1•nslog•8m ago•0 comments

Agents need good developer experience too

https://modal.com/blog/agents-devex
1•birdculture•9m ago•0 comments

The Dark Factory

https://twitter.com/i/status/2020161285376082326
1•Ozzie_osman•9m ago•0 comments

Free data transfer out to internet when moving out of AWS (2024)

https://aws.amazon.com/blogs/aws/free-data-transfer-out-to-internet-when-moving-out-of-aws/
1•tosh•10m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•alwillis•11m ago•0 comments

Prejudice Against Leprosy

https://text.npr.org/g-s1-108321
1•hi41•12m ago•0 comments

Slint: Cross Platform UI Library

https://slint.dev/
1•Palmik•16m ago•0 comments

AI and Education: Generative AI and the Future of Critical Thinking

https://www.youtube.com/watch?v=k7PvscqGD24
1•nyc111•16m ago•0 comments

Maple Mono: Smooth your coding flow

https://font.subf.dev/en/
1•signa11•17m ago•0 comments

Moltbook isn't real but it can still hurt you

https://12gramsofcarbon.com/p/tech-things-moltbook-isnt-real-but
1•theahura•21m ago•0 comments

Take Back the Em Dash–and Your Voice

https://spin.atomicobject.com/take-back-em-dash/
1•ingve•22m ago•0 comments

Show HN: 289x speedup over MLP using Spectral Graphs

https://zenodo.org/login/?next=%2Fme%2Fuploads%3Fq%3D%26f%3Dshared_with_me%25253Afalse%26l%3Dlist...
1•andrespi•22m ago•0 comments

Teaching Mathematics

https://www.karlin.mff.cuni.cz/~spurny/doc/articles/arnold.htm
2•samuel246•25m ago•0 comments

3D Printed Microfluidic Multiplexing [video]

https://www.youtube.com/watch?v=VZ2ZcOzLnGg
2•downboots•25m ago•0 comments

Abstractions Are in the Eye of the Beholder

https://software.rajivprab.com/2019/08/29/abstractions-are-in-the-eye-of-the-beholder/
2•whack•26m ago•0 comments

Show HN: Routed Attention – 75-99% savings by routing between O(N) and O(N²)

https://zenodo.org/records/18518956
1•MikeBee•26m ago•0 comments

We didn't ask for this internet – Ezra Klein show [video]

https://www.youtube.com/shorts/ve02F0gyfjY
1•softwaredoug•27m ago•0 comments

The Real AI Talent War Is for Plumbers and Electricians

https://www.wired.com/story/why-there-arent-enough-electricians-and-plumbers-to-build-ai-data-cen...
2•geox•29m ago•0 comments

Show HN: MimiClaw, OpenClaw(Clawdbot)on $5 Chips

https://github.com/memovai/mimiclaw
1•ssslvky1•30m ago•0 comments

I Maintain My Blog in the Age of Agents

https://www.jerpint.io/blog/2026-02-07-how-i-maintain-my-blog-in-the-age-of-agents/
3•jerpint•30m ago•0 comments

The math of shuffling cards almost brought down an online poker empire

https://www.scientificamerican.com/article/how-the-math-of-shuffling-cards-almost-brought-down-an-online-poker-empire/
98•indigodaddy•5mo ago

Comments

ajross•5mo ago
The article goes through a mostly irrelevant discussion about "52 factorial is a really really big number" before getting to the computer-science meat of the headline:

RNG was seeded with a millisecond time of day, leading to 86.4M possible decks. Oooph.
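For concreteness, here is that seed-space arithmetic as a small sketch (plain integer math, no external crates); the "226 bits" figure comes from 52! ≈ 2^226:

    fn main() {
        // If the RNG is seeded with the millisecond time of day, the entire
        // seed space is the number of milliseconds in a day.
        let seeds: u64 = 24 * 60 * 60 * 1000; // 86_400_000
        // A full 52-card shuffle needs ~226 bits of state (52! ~ 2^225.6),
        // so only a vanishingly small fraction of decks is even reachable.
        let seed_bits = (seeds as f64).log2(); // ~26.4 bits
        println!("{seeds} seeds, ~{seed_bits:.1} bits of state vs ~226 bits needed");
    }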

sejje•5mo ago
It didn't even go into the fact that it got exploited, which it did.

And I don't think it almost brought them down.

shagie•5mo ago
The number 52! is absurdly large. The article is behind a paywall so I can't see its full discussion, but I'm reminded of https://czep.net/weblog/52cards.html, which has a good bit on just how large the number is.

> Start by picking your favorite spot on the equator. You're going to walk around the world along the equator, but take a very leisurely pace of one step every billion years. The equatorial circumference of the Earth is 40,075,017 meters. Make sure to pack a deck of playing cards, so you can get in a few trillion hands of solitaire between steps. After you complete your round the world trip, remove one drop of water from the Pacific Ocean. Now do the same thing again: walk around the world at one billion years per step, removing one drop of water from the Pacific Ocean each time you circle the globe. The Pacific Ocean contains 707.6 million cubic kilometers of water. Continue until the ocean is empty.

And it goes on.

Numbers like TREE(3) or Loader's number or the various "these are really big mathematical functions" ( relevant xkcd https://xkcd.com/207/ )... we know we can't comprehend them. But 52! - that's something we think we can get our heads around... and it's mind-bogglingly large once you start doing the math and trying to relate it to real things we can relate to (which are themselves mind-bogglingly large).

gucci-on-fleek•5mo ago
52! < 2^226, so it's not that big in computer terms. It's smaller than a single AVX2 register, a ChaCha20/AES-256/Curve25519 key, a pair of IPv6 addresses, etc. Still mindbogglingly huge, but not uncommonly large.
danielheath•5mo ago
Right, but storing 52! distinct values isn’t something a computer is going to be able to do.
zmgsabst•5mo ago
So?

You’re only using one deck at a time; so you only need to generate 1 bit randomly 226 times — then use that deck.

bobbylarrybobby•5mo ago
Without a bunch of (unentangled) electrons you can observe, getting 226 truly random bits in quick succession doesn't seem all that easy.
alexey-salmin•5mo ago
Depends on what you mean by "truly random" but if it's "cryptographically secure random" then it's not particularly hard.

There are strong monetary incentives to break all kinds of private keys but most of the time they hold pretty well.

Sanzig•5mo ago
RDRAND has a throughput of hundreds of megabytes per second on most processors. Getting 226 bits is pretty easy, even shuffling millions of decks per second you'd be bottlenecked elsewhere.

(That does sound like a fun optimization challenge: "write a program to fairly shuffle the most decks of cards per second possible.")

jdietrich•5mo ago
Quantum random number generators are commercially available, relatively inexpensive and can supply hundreds of megabits per second of truly random data from a single PCIe card.
adastra22•5mo ago
A quantum computer can.
alexey-salmin•5mo ago
Well neither can a human.

The whole premise of computers being unable to shuffle as well as humans is ridiculous. 52! is huge but log(52!) is not.

If anything it's humans (unless specifically trained) who produce horribly biased shuffles. Whole chunks of sequences from the previous deal can show up, either directly or spaced 2x or 4x. Try ordering a deck first, then shuffle it, then open each card and see if you notice any patterns.

kerkeslager•5mo ago
Sure, it's a big number, but nobody was ever approaching the problem of generating a randomly shuffled deck by generating all 52! possibilities and choosing one of them.
sneak•5mo ago
> Card dealers create a unique deck with each shuffle, something computers cannot replicate

This is entirely, unequivocally false.

Shuffling as an algorithm to implement is easy to fuck up, but if you use a good one and a true RNG source, computers can shuffle better than humans - just as randomly, and many orders of magnitude faster.

smallerize•5mo ago
I think the RNG source is the point here. It's not as impossible as stated, but you have to put some work in to get a good one.
awesome_dude•5mo ago
Go (and probably other languages) has the Fisher-Yates algorithm for shuffling

https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

genghisjahn•5mo ago
It’s comments like this that keep me coming back to HN. I’d never heard of this before.
andrewflnr•5mo ago
Really? It repeats a point from the article, and in so doing utterly misses the point of the comment it's replying to. Fisher-Yates is at least not broken, but it can only be as good as the underlying RNG.
awesome_dude•5mo ago
Go has a cryptographically secure RNG now, I wouldn't be surprised if it's being used for the shuffle.

Also, I'm keen to see what shuffle algorithms you know of that aren't susceptible to RNG issues

kerkeslager•5mo ago
1. Every remotely competent language has a secure (P)RNG and is using it to Fisher-Yates shuffle. Go is not special in this regard.

2. Nobody in this thread is criticizing Fisher-Yates, because in all likelihood all of us are using Fisher-Yates. We're discussing the failure of the algorithm used in the article.

3. Please take the time to read and understand the posts you respond to before you respond to them.

awesome_dude•5mo ago
In case anyone is following along, the hostility the above poster is demonstrating is because there isn't any shuffling algorithm that isn't susceptible to the RNG issue, and he didn't think about the cryptographically secure RNGs in use today.
kerkeslager•5mo ago
My friend, I am not hostile. I'm merely pointing out that you have not understood a single post you've responded to. Including the one you responded to just now.

The following algorithm is not susceptible to RNG issues (in Rust, since I've been playing around with it):

    fn shuffle<T>(_vec: &mut [T]) {
        // deliberately does nothing
    }
And I'm sure since you're reading all these posts so carefully you'll totally understand why this is both a joke and a proof that you don't know what you're talking about.
awesome_dude•5mo ago
More snark, thus proving the initial assessment.
kerkeslager•4mo ago
A smart move, not responding on topic, since you clearly don't understand the topic. Instead, just accuse me of being snarky! Guilty as charged, I guess, but at least I can read and understand what I'm responding to.
andrewflnr•5mo ago
> I wouldn't be surprised if it's being used for the shuffle.

That would certainly be a good thing to check before using Go's shuffle for real-money poker games. I wouldn't take it for granted.

> Also, I'm keen to see what shuffle algorithms you know of that aren't susceptible to RNG issues

There are not and cannot be any such algorithms. That's the entire point of this thread. Fisher-Yates is necessary but not sufficient. You either have sufficient random bits for your shuffle or not.

namibj•5mo ago
As a sort of TL;DR: it's basically like selection sort but instead of selecting the sort-compliant choice among the remaining ones, you ask the RNG and pick a random one.
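A minimal sketch of that idea (Fisher-Yates), assuming the rand crate with its 0.8-style API; a real-money implementation would want a cryptographically secure generator behind it:

    use rand::Rng;

    /// Fisher-Yates: walk from the end and swap each position with a
    /// uniformly chosen index at or below it. Every permutation is
    /// equally likely, provided the RNG itself is unbiased.
    fn fisher_yates<T>(deck: &mut [T], rng: &mut impl Rng) {
        for i in (1..deck.len()).rev() {
            let j = rng.gen_range(0..=i);
            deck.swap(i, j);
        }
    }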
duskwuff•5mo ago
However, note that using a random function as the comparator for a sort function does not generally result in an unbiased shuffle.
vesche•5mo ago
Python's random.shuffle also uses Fisher-Yates
awesome_dude•5mo ago
I wonder how many other languages have it implemented as their shuffle algorithm
Sanzig•5mo ago
Do you? Modern CPUs have had built-in TRNGs for years.
indigodaddy•5mo ago
Maybe they mean that computers can't replicate the imperfections/shortcomings that might define human/hand shuffling? That's kind of how I took it.
evrydayhustling•5mo ago
Those human imperfections likely decrease randomness - for example, leaving cards that started adjacent more likely to remain adjacent than by strict chance.
harrall•5mo ago
They most definitely decrease randomness.

But I guess the article’s point is that human imperfections offset that with lower correlated failure modes.

dwattttt•5mo ago
Given

> an algorithm would randomly select an arrangement from the 52! possible decks. But no computer has enough memory to evaluate all of these possibilities

It sounds more like they don't understand how a computer can shuffle something.

mistercow•5mo ago
There’s an almost mystical belief among certain tech and science journalists that computers are bad at randomness, and it’s really bizarre, and in my opinion, pretty harmful.
ziofill•5mo ago
“In an ideal world, an algorithm would randomly select an arrangement from the 52! possible decks. But no computer has enough memory to evaluate all of these possibilities, and a perfect random number generator doesn’t yet exist. Therefore, developers generally rely on algorithms that simulate card shuffling.”

You’ve got to be kidding me. What’s wrong with sampling each card independently and uniformly (without replacement)?

RainyDayTmrw•5mo ago
If you do it correctly, you've reinvented Fisher-Yates[1]. If you do it wrong, you've reinvented this unnamed, broken algorithm[2], instead.

But the issue in the article isn't application of pseudorandom numbers. It's seeding the generator.

[1]: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle [2]: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#N...
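For contrast, a sketch of the "wrong" variant the second link describes (swapping every position with an index drawn from the whole deck), using the same assumed rand crate as above: it looks nearly identical, but it has 52^52 equally likely execution paths, which 52! does not divide, so some permutations come up more often than others.

    use rand::Rng;

    /// Biased "naive" shuffle: each position is swapped with an index
    /// chosen from the entire deck. n^n execution paths cannot map
    /// evenly onto n! permutations, so the result is not uniform.
    fn naive_shuffle<T>(deck: &mut [T], rng: &mut impl Rng) {
        for i in 0..deck.len() {
            let j = rng.gen_range(0..deck.len());
            deck.swap(i, j);
        }
    }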

alexey-salmin•5mo ago
> In the late 1990s the development platform ASF Software supplied several online poker providers, such as Planet Poker, with card-shuffling algorithms. The platform even posted the algorithm on its website as proof that the game was reliably programmed.

What amazes me is the level of overconfidence it took for the developers of the broken algorithm to post it online. It's not as if probability theory was a novel and unexplored field at the time.

caminante•5mo ago
Not sure "overconfidence" applies as you might be stretching the author's unfounded narrative.

This is more impressive than the alternatives:

1. Security through obscurity.

2. Increased financial liability due to #1.

alexey-salmin•5mo ago
Imagine you proudly present to the public your obviously flawed version of the algorithm even though the correct version has been known for decades. If only you'd read a single book on the topic.

If that's not overconfidence then it's hard to find what is.

caminante•5mo ago
You're just restating your initial claim and not addressing the issue I raised with the latter.
alexey-salmin•5mo ago
What is the issue? It's not at all clear from your comment. You say above that it's better than security through obscurity, but that's beside the point.
caminante•5mo ago
> but it's beside the point

Why is it beside the point?

You haven't established their intent for gross negligence and give no charity to the fact this was 30 years ago (pre-Wikipedia and the search breadth we have today). Since then, people have continued to expose severe RNG design flaws in other systems designed by very smart people. It happens...

alexey-salmin•5mo ago
30 years ago wasn't the dark ages. Wikipedia didn't exist, but books on probability theory and statistics did.

When you do a shuffling algorithm in a sensitive context (money or security), you have to prove that it returns all the possible permutations with equal probability, and put lower bounds on the amount of entropy you need from the underlying RNG. If you're unable to prove it, you shouldn't move forward with the algorithm. Any irregularities in the output distribution can be exploited. This is textbook knowledge pioneered in early encryption work and perfected by the time WWII ended. Evidently the effort to prove correctness was never made.

Now the original article can indeed misrepresent or omit important facts. I'm definitely open to reconsider my conclusion if more facts become available. However "there was no Wikipedia" isn't one of them, it doesn't count as an excuse not to do your job properly.

If it turned out, for example, that "ASF Software" wasn't even aware that their shuffling algorithm was used to shuffle cards and just shipped it along with 200 other basic algorithms as some sort of "better standard library", this would change the situation. However, from some quick googling it seems that their product wasn't a standard library; it was specifically software for Texas Hold'em. This is a "you had one job" kind of situation.

> Since then, people have continued to expose severe RNG design flaws in other systems designed by very smart people. It happens...

Absolutely, but we're not talking frontiers of Computer Science here.

* If you seed your RNG with at most 86 million unique values, you get at most 86 million unique random sequences out of it.

* If your code should have M possible equiprobable outcomes but is actually driven by only N equiprobable RNG states, and M doesn't divide N, you're in trouble, as the sketch below illustrates.
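A tiny, deterministic illustration of that divisibility point: map 256 equally likely byte values onto 52 cards with a modulo and the distribution is already uneven (no RNG needed to see it):

    fn main() {
        // 256 = 4 * 52 + 48, so residues 0..=47 get 5 preimages each
        // while residues 48..=51 get only 4: a built-in bias.
        let mut counts = [0u32; 52];
        for byte in 0u32..256 {
            counts[(byte % 52) as usize] += 1;
        }
        println!("card 0: {} ways, card 51: {} ways", counts[0], counts[51]); // 5 vs 4
    }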

caminante•5mo ago
> 30 years ago it wasn't dark ages...

I didn't say or imply books didn't exist. You can't credibly say it was as readily available, and I promise you that people are still making these mistakes, today.

> When you do a shuffling algorithm in a sensitive context (money or security), you have prove that it returns all the possible...If you're unable to prove it, you shouldn't move forward with the algorithm.

Ideally, of course! This is a really high standard that I'm afraid isn't enforced in a lot of commercial or even sensitive applications. 86 million permutations is probably good enough, and even if someone was clever enough to sync clocks and narrow it to 200k permutations, I'm not convinced there was actually any harm.

Do you have any proof of harm?

And there were plenty of smart people in the '90s and beyond who didn't realize that relying on a system clock to seed values is attackable. These guys, to their credit, patched their system by openly providing their algorithms.

Even if their clients had been harmed, they'd published the algorithm so that their "sophisticated" clients could audit the algorithm.

> I'm definitely open to reconsider my conclusion if more facts become available.

This is circular as you're taking the article's narrative at face value without getting any primary sources confirming gross negligence or "arrogance" as you imply.

alexey-salmin•5mo ago
> Ideally, of course! This is a really high standard that I'm afraid isn't enforced in a lot of commercial or even sensitive applications. 86 million permutations is probably good enough and even if someone was clever enough to synch clocks and narrow to 200k permutations, then I'm not convinced there was actually any harm.

Of course not, this is ridiculous. If your job is to shuffle the deck, shuffle it well. It's like doing an 80/20 coinflip and arguing that 50/50 is a "really high standard". And that for a company that sells bet-money-on-coinflips software.

If you don't know how to do it well -- read a book or use std::random_shuffle. Somehow Stepanov was able to do it right (assuming a good RandomNumberGenerator input) on the first try in 1993, without Wikipedia, poor guy. And this wasn't even his main job; random_shuffle was one of dozens of algorithms he envisioned and implemented for the STL.

> This is circular as you're taking the article's narrative at face value without getting any primary sources confirming gross negligence or "arrogance" as you imply.

I did some quick research and it seems that ASF Software had indeed developed the Planet Poker online platform. So this comes down to failing at your main job; I don't really see what other evidence you expect here.

I strongly believe that people in general and software engineers in particular should be held up to high standards. Finding excuses for how school-level math is too hard for them is condescending. It is disrespectful to the very people you're talking about.

If you say they couldn't even understand that N^N is not divisible by N! you basically say that they're mentally challenged. I on the contrary say that they most certainly would've been able to understand it if they made an effort -- which they didn't. So negligence.

alexey-salmin•5mo ago
UPDATE. I think I should also address this:

> if someone was clever enough to synch clocks and narrow to 200k permutations, then I'm not convinced there was actually any harm.

I don't think you understand the situation at all. In Hold'em in the end you see 7 cards: 2 in your hand and 5 on the table. That's 52x51x...x46 = 674B different sequences of open cards.

This means that by the time you see these cards you can know exactly which of the 200k permutations the engine had chosen for this hand. There's only one that precisely matches the combination of open cards you observe, out of the 674 billion possibilities.

In fact, by the time you see the flop (2+3 open cards, 311M variants), you know everyone else's cards.
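A sketch of the attack shape being described, with a hypothetical deck_from_seed() standing in for the site's seeded shuffle (the real generator isn't reproduced here) and the simplifying assumption that the five cards you can see are the first five of the deck:

    use std::collections::HashMap;

    /// Hypothetical stand-in for the site's shuffle as a function of its seed.
    fn deck_from_seed(seed: u32) -> [u8; 52] {
        let mut deck = [0u8; 52];
        for (i, card) in deck.iter_mut().enumerate() {
            *card = ((i as u32 + seed) % 52) as u8; // placeholder only
        }
        deck
    }

    fn main() {
        // Precompute all ~86.4M reachable decks and index them by the five
        // cards visible after the flop. With far fewer decks than five-card
        // sequences (~311M), almost every key identifies a single deck.
        // (A real 1999-era attack would build this as a compact on-disk index.)
        let mut index: HashMap<[u8; 5], u32> = HashMap::new();
        for seed in 0..86_400_000u32 {
            let deck = deck_from_seed(seed);
            index.insert([deck[0], deck[1], deck[2], deck[3], deck[4]], seed);
        }
        // At the table: look up your hole cards plus the flop, recover the
        // seed, regenerate the full deck, and read off everyone's cards.
        println!("indexed {} decks", index.len());
    }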

caminante•4mo ago
I assume English is your second language.

Nobody's arguing that having a synched clock would NOT make them an advantage player.

You left out the part where I asked for proof anyone had exploited it (harm) and to what significance (how much harm). This actually matters in commercial and practical terms. Otherwise, you're really pressed to claim any real damages.

Meanwhile, and for hopefully the last time, it appears you're holding them to this idealist standard -- similar to an Italian reacting to someone snapping spaghetti noodles in half before throwing them into boiling water.

alexey-salmin•4mo ago
You still don't get it.

No need to synchronize the clock. The date alone is enough to guess hands of everyone at the table and turn and river, right after you see the flop.

That's as big of a hole as it can possibly get. That's enough to establish incompetence and/or gross negligence of the authors. Whether the hole was exploited is immaterial to the question.

caminante•4mo ago
"the date alone"

?

You're apparently hallucinating articles outside HN.

FTA:

>Simply syncing up their own program to the system clock reduced the possibilities to a mere 200,000 potential decks that the algorithm could generate.

alexey-salmin•4mo ago
> For another, the system ties its number generation to the number of seconds that have passed since midnight, resetting once each day, which further limits the possible random values. Only about 86 million arrangements could be generated this way, the Reliable Software Technologies team discovered.

86 million is much less than the ~300 million possible combinations you can see after the flop. This means that after the flop you know which exact shuffle was used (with a few statistically unlikely collisions where you may have 2 or 3 options).

caminante•4mo ago
Dude, this is so weird, and you continue to act in bad faith.

You need to specify UNIX 'date' as your intent as that phrasing wasn't used in the article.

It's also splitting hairs to say going from 300 million to 80 million is "much less" when that's not even the point of contention. Further to why you're splitting hairs, here's an actual research article [0] where the researchers point out that you needed the synched clock (not just the sysdate) to exploit it with hardware readily available at the time of the exploit, using Pentium 400s.

> That's enough to establish incompetence and/or gross negligence of the authors.

Going back to this claim, I really don't think you know what this term of art means. Ask a legal colleague/friend what they think is the criteria for "gross."

[0] https://web.archive.org/web/20140104095330/http:/www.cigital...

alexey-salmin•4mo ago
> You need to specify UNIX 'date' as your intent as that phrasing wasn't used in the article.

No, I didn't mean unix date, I mean literally date.

I can see two interpretations of the phrasing in the article. Either you have 86M shuffles per day (in this case knowing the date would benefit you) or you have 86M shuffles period (in this case even the date isn't necessary, you already have the totality of information). In both cases we can consider the problem of solving the game with 86M shuffles.

Syncing clocks is needed to enumerate all possible shuffles in real time on a 1999 PC, which is what the paper demonstrates. Doing this in real time for 86M combinations wouldn't have been possible back then. However, building a 1 GB index file and doing an HDD lookup in real time was absolutely possible on very modest 1999 hardware; you could write such a program in a couple of hours.

Knowing the shuffle with three more rounds of betting to go represents a completely broken poker game, not just some minor biases in outcomes.

I have absolutely no idea what hairsplitting you are talking about, let alone bad faith discussions. 86M combinations is such a small number that you can analyse all of them and solve the game even on 1999 hardware. It's a fact, not a matter of opinion or idealistic standards. If you can just kindly acknowledge this fact, no further discussion will be necessary.

caminante•4mo ago
> No, I didn't mean unix date, I mean literally date.

Then, your entire comment is predicated on a mis-quote you emphasized. The article said "seconds," not day of the month. You need to work on clarity, if that was your intent.

> I have absolutely no idea what hairsplitting you are talking about

Either you're lying to me or yourself.

See discussion about "negligence" that you conveniently ignore. Meanwhile, you're tilting at windmills as you keep insinuating someone is arguing against you on the point of algorithmic flaws.

alexey-salmin•4mo ago
I'm sorry, but you simply don't understand how the exploit works. You don't need to guess the seed of the RNG, you don't need to synchronize anything, none of that matters.

There is such a low number of possible seeds that you can try them all, enumerate all possible shuffles, then check which one of them you've got. Then you know everyone's cards. This is why the game is fatally broken. That's it.

Call it negligence or not, I don't care much. But I'm amused by the fact that you fail to understand a basic combinatorics problem.

caminante•4mo ago
Yeah yeah...more ad hominems and strawmen.

> Call it negligence or not, I don't care much.

More bad faith.

alexey-salmin•4mo ago
I have no intention of building straw men or bad faith arguments. If I assume good faith on your part (which becomes increasingly hard), the only explanation for your position above is some sort of misunderstanding of the combinatorial problem.

So, very simple and good-faith-not-a-straw-man question: do you agree that with 86M possible seeds you can guess all the cards with high probability after you see the flop? No clock synchronization necessary or any other tricks, just direct precomputation of all possible shuffles.

Yes or no would suffice.

caminante•4mo ago
Not trying to be mean. But, I've already written you off, sorry.

Please go poll some English speaking friends and someone with experience in commercial law whether you're making sense.

You're also welcome to blow through time trying to prove whether an 86M brute-force approach is deliverable in game time with the hardware of the day, when the researchers expressly pointed out you can approximate the clock based on a couple of hands and narrow it to 200k * [+/- units of time] sorts.

alexey-salmin•4mo ago
Ok, so you refuse to answer even the most straightforward question. I guess you can reflect in your free time why it feels like people are attacking straw men while talking to you.

In this case it happened because after many messages I'm still completely lost as to what your point is exactly, and you refuse to answer simple questions to clarify it. All I know is you're angry and apparently disagreeing with something I said.

Good day then.

48terry•5mo ago
...They posted their algorithm as a way to prove it was reliable. Someone pointed out it wasn't reliable. They revised the algorithm. What's the problem here?
alexey-salmin•5mo ago
They were selling this algorithm for money and evidently didn't use any of that money to hire a single statistician to validate it. The mistake they've made isn't obscure and doesn't require a thousand pairs of eyes to catch, just one.

Imagine paying a professional plumber who installs the toilet upside down and then posts photos online for the community to check his work.

woopsn•5mo ago
I guess this PRNG has a recurrence period much less than 52! (or wasn't seeded appropriately), so it only sampled a fraction of the distribution.

"... the system ties its number generation to the number of seconds that have passed since midnight, resetting once each day, which further limits the possible random values. Only about 86 million arrangements could be generated this way,"

mmmpetrichor•5mo ago
This became a hot discussion topic in Magic: The Gathering Online (MTGO). My shuffles in MTGO always felt somehow different from my offline paper shuffles. It's hard to know for sure when it's all closed source. I know a lot of people were suspicious though.
IncreasePosts•5mo ago
Closed source doesn't need to stop you - even if a server was dealing out the cards you can still do a randomness test on them once you gather enough deals, but you would probably need millions of hands of data to do the analysis (see: dieharder tests)
Fezzik•5mo ago
MTGA uses a method called ‘smoothing’ for initial hand selection: https://mtgazone.com/mtg-arenas-opening-hand-algorithm-and-s...

I assume MTGO uses something similar and I swear I had a great analytical article bookmarked regarding MTGO shuffles but I can’t find it.

malnourish•5mo ago
Modo uses FY and does not smooth draws. Arena only smooths in best-of-1 matches.
Sophira•5mo ago
One thing that I wonder about is that even with Fisher-Yates, a PRNG relies on a seed value. Assuming that this seed is a 64-bit value, it can only go up to a maximum of 2^64-1 = 18,446,744,073,709,551,615, which is still far less than 52!.

I know that modern operating systems use an entropy pool. Does this mean that the PRNG is re-seeded for every random number generated, thus making this a non-issue? Or does it just take from this entropy pool once?
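One way to sidestep the 64-bit-seed worry is to pull the whole key from the OS entropy pool rather than expanding a small seed; a Unix-only sketch using just the standard library:

    use std::fs::File;
    use std::io::Read;

    /// Read 32 bytes (256 bits) straight from the kernel's entropy pool,
    /// comfortably more than the ~226 bits a full 52-card shuffle needs.
    fn os_random_key() -> std::io::Result<[u8; 32]> {
        let mut key = [0u8; 32];
        File::open("/dev/urandom")?.read_exact(&mut key)?;
        Ok(key)
    }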

adastra22•5mo ago
No cryptographically secure PRNG has a 64-bit state. If you are not using a CSRNG in an adversarial environment, you have bigger problems.
kristianp•5mo ago
Link to the discoverers of the problem: https://web.archive.org/web/20140104095330/http:/www.cigital...

It actually goes into some depth on the multiple problems with the algorithm published by the gaming company.

thedigitalone•5mo ago
No paywall mirror: https://archive.ph/ONQgF
dragontamer•5mo ago
RDRAND and RDSEED are both using quantum principles (aka: heat and temperature / truly quantumly random noise at the microscopic level in the CPU's transistors) to generate random numbers.

Well... a seed at least. And then they are expanded using AES encryption IIRC (which "shouldn't" be breakable, and even if it were breakable it'd probably be very difficult to follow). I think RDSEED takes hundreds (or nearly a thousand) cycles to complete, but we're still talking millions-of-bits of entropy per second. More than enough to shuffle a deck even if you're taking a fresh RDSEED every single card.

Every few months, it feels like "someone effed up RNG" becomes an article. But in practice, RDRAND / RDSEED are the primitives you need. And you should be getting that for free with Linux's /dev/urandom on modern platforms.

----------

I think RDSEED / RDRAND cannot be "proven secure" because of all the VMs we are running in practice, though. So it's something you need to be running on physical hardware to be 100% sure of security. So it's still harder than it looks.

But it's not "impossible" or anything. Just work to cover all the little issues that could go wrong. After all, these RDRAND/RDSEED instructions were created so that we can send our credit card numbers securely across the internet. They're solid because they _HAVE_ to be solid. And if anyone figures out a problem with these instructions, virtually everyone in the cryptographic community will be notified of it immediately.

---------

EDIT: I should probably add that using the shot-noise found in a pn-junction (be it a diode or npn transistor) is a fun student-level EE project if anyone wants to actually play with the principles here.

You are basically applying an amplifier of some kind (be it 3x inverters, or an OpAmp, or another NPN transistor) to a known quantum source of noise. Reverse-avalanche noise from a Zener diode is often chosen, but there are many, many sources of true white noise that you could amplify.

klodolph•5mo ago
What I’m impressed by is getting noise of a consistent level out of a circuit. That’s a nice second layer of difficulty to the “make some noise” EE project.
blobbers•5mo ago
I think the noise has to be random... so it's inherently inconsistent ;) .. maybe?
dragontamer•5mo ago
It's easy to reason about if you can see it in both the frequency and time domains.

The Fourier transform of white noise is still.... white noise. Random is random, as you say. But this has implications. It means the "wattage" of the noise (i.e., Voltage * Current == Watts, aka its power) is a somewhat predictable value. If you have 0.5 Watts of noise, it will be 0.5 Watts of noise in the frequency domain (after a Fourier transform, across all frequencies).

The hard part of amplification is keeping it consistent across all specifications. I assume the previous post was talking about keeping white noise (which is "flat" across all frequencies) truly flat. I.e., your OpAmps (or whatever other amplifier you use) CANNOT distort the value.

Which is still student level (you cannot be a good EE / Analog engineer if you're carelessly introducing distortions). Any distortion of white-noise is easily seen because your noise profile weakens over frequency (or strengthens over frequency), rather than being consistent.

DHRicoF•5mo ago
But most common noise sources are not white; you have to decolor them first.
dragontamer•5mo ago
Alternatively, you can choose a proven source of white noise.

Such as the reverse-bias shot and/or avalanche noise at the pn junction of a reverse bias'ed Zener Diode. Which is white-noise into the hundreds-of-MHz. Maybe not good enough for RDSEED, but certainly good enough and fast-enough for most hobbyist projects who are experimenting with this for the first time.

kittikitti•5mo ago
These are methods to generate cryptographically secure pseudo random numbers using a truly random seed.
dragontamer•5mo ago
RDSEED returns a truly random seed (as long as you aren't getting hacked while running in a VM or something)

RDRAND is allowed to stretch the true random seed a little bit inside of hardware, which is sufficient for most purposes.

thijsr•5mo ago
When you can modify the microcode of a CPU, you can modify the behaviour of the RDRAND/RDSEED instructions. For example, using EntrySign [1] on AMD, you can make RDRAND always return 4 (chosen by a fair dice roll, guaranteed to be random).

[1] https://bughunters.google.com/blog/5424842357473280/zen-and-...

dragontamer•5mo ago
I don't mean to say that RDSEED is sufficient for security. But a "correctly implemented and properly secured" RDSEED is indeed quantum-random.

IE: While not "all" RDSEED implementations (ie: microcode vulnerabilities, virtual machine emulation, etc. etc.) are correct... it is possible to build a true RNG for cryptographic-level security with "correct" RDSEED implementations.

------

This is an important factoid because a lot of people still think you need geiger counters and/or crazy radio antenna to find sufficient sources of true entropy. Nope!! The easiest source of true quantum entropy is heat, and that's inside of every chip. A good implementation can tap into that heat and provide perfect randomness.

Just yeah: microcode vulnerabilities, VM vulnerabilities, etc. etc. There's a whole line of other stuff you also need to keep secure. But those are "Tractable" problems and within the skills of a typical IT Team / Programming team. The overall correct strategy is that... I guess "pn-junction shot noise" is a sufficient source of randomness. And that exists in every single transistor of your ~billion transistor chips/CPUs. You do need to build out the correct amplifiers to see this noise but that's called RDSEED in practice.

pmarreck•5mo ago
Why not just simulate a real shuffle?

Just "cut" in a random location (rotate the deck a random amount), then split the deck roughly in half (add a bit of random variation), then "flip them together" back to front by alternately taking 1 or 2 (randomly, add a small chance of 3, so maybe 50% 1, 40% 2 and 10% 3) from each side till there are no cards left to shuffle. Then repeat 8 times or so (there's a certain minimum number of times that ensures good randomness)

kerkeslager•5mo ago
> Why not just simulate a real shuffle?

If you are asking why your proposed shuffling method is insecure: I don't know, and that's why I would never use that.

Asking "why not do X?" is entirely not paranoid enough for security. If you want to propose an algorithm, start with trying to prove the security claims of the algorithm. In this case, you'd need to prove that your algorithm creates a permutation that is indistinguishable from random. If you can't prove it, it's highly probable that your algorithm doesn't create a random permutation and someone will figure out how to break it.

I'll point out that we already have proven shuffling algorithms[1] and they're obviously faster than what you've proposed. So the simple answer to your question is "Because it's unproven and slower than proven algorithms."

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

pmarreck•5mo ago
From what I understand, the quality of the randomness of Fisher-Yates depends entirely on the quality of the random source AND that you didn't bias it by using modulo with a non-evenly-dividing divisor. It actually says that right in the article.

My method may not suffer as much from those drawbacks, but you're right, without testing it thoroughly, there's no way to know and it should not be relied upon instead of F-Y.

EDIT: My intuition was correct more or less. Doing it the way I described serves to "smear" any bias across the deck. Fascinating chatgpt convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Turns out that an even easier way is just to assign 128 bits of randomness to each card and sort them by that. Degrades gracefully from biased sources, apparently.

kerkeslager•5mo ago
> From what I understand, the quality of the randomness of Fisher-Yates depends entirely on the quality of the random source AND that you didn't bias it by using modulo with a non-evenly-dividing divisor. It actually says that right in the article.

Yes.

Pretty much every modern language ships with a secure PRNG (which probably just calls /dev/urandom). A poker site probably has enough throughput to want to not block while waiting for /dev/urandom to build up entropy, so they might do something faster, but /dev/urandom is probably secure; it just might be slower than a big poker site needs.

The non-evenly-dividing divisor thing is a bit trickier, which is why standard libraries implement Fisher-Yates for you. But the solution is basically:

Using your PRNG, generate the number of bits you need. So if you need a number 0-51, generate 6 bits. 6 bits can hold 0-63. If you get a number in the range 52-63, discard that and generate a new 6-bit number with the PRNG.

If you need a number in an awkward range like 0-35, you'll need to discard a lot of numbers, but keep in mind this may still be faster than modular division which is pretty dang slow.
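A sketch of that discard-and-retry (rejection sampling) idea, using only raw bits from the generator; drawing the high bits also sidesteps the weak-low-bits problem mentioned elsewhere in the thread:

    use rand::RngCore;

    /// Rejection sampling: draw exactly the bits needed, discard values
    /// that fall outside 0..n, and retry. No modulo, so no modulo bias.
    /// For n = 52 this draws 6 bits (0..=63) and retries on 52..=63.
    fn uniform_below(n: u32, rng: &mut impl RngCore) -> u32 {
        assert!(n > 0);
        if n == 1 {
            return 0;
        }
        let bits = 32 - (n - 1).leading_zeros(); // e.g. 6 for n = 52
        loop {
            let candidate = rng.next_u32() >> (32 - bits); // keep the high bits
            if candidate < n {
                return candidate;
            }
        }
    }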

> My method may not suffer as much from those drawbacks, but you're right, without testing it thoroughly, there's no way to know and it should not be relied upon instead of F-Y.

That's very much not what I said.

"Testing it thoroughly" is not adequate. Start by proving the algorithm is correct. If you don't have a proof the algorithm works there's no reason to even start implementing it so there's nothing to test.

> EDIT: My intuition was correct more or less. Doing it the way I described serves to "smear" any bias across the deck. Fascinating chatgpt convo on it: https://chatgpt.com/share/68bd103f-9188-8004-8cbc-86693a0d87...

Jesus Christ, no. If you still believe anything ChatGPT says then security is not the field for you.

indigodaddy•5mo ago
Lol, I laughed at his confident usage of "turns out" and "apparently" after his conclusive conversation with his chatgpt expert..
pmarreck•5mo ago
I never claimed conclusivity; words like "apparently" were supposed to suggest that, which is why I didn't use "obviously," "definitely" or "conclusively".

Also, maybe be less of an asshole next time. People are allowed to be wrong, that is the whole point of a discussion, and ridicule is not a counterargument.

/eyeroll

indigodaddy•5mo ago
I don't think you're either wrong or right (in fact my intuition sides with yours, I think) because I definitely know way less than you about the subject in general, and probably less than almost anyone here on HN -- it's just that the trust swaths of people place in the authoritativeness of LLM outputs generally amuses me. I'm sure we'll get there one day, but we're not there yet. I clearly offended you, and I apologize.
kerkeslager•4mo ago
Sure, it's okay to be wrong, but it gets significantly less okay when you just keep doubling down on being wrong and refuse to admit you are wrong.

Unless I've missed something, you still seem to be under the very wrong impression that you've come up with a secure shuffling algorithm. You simply have not come up with a secure shuffling algorithm. The minimum viable evidence of security for an algorithm is mathematical proof, and you don't have that, and you don't get to sit at the security table until you do. Put bluntly: mathematical proof or GTFO.

Yes, I'm gatekeeping, and there's a very good reason for that: we know from tens of thousands of bad attempts at secure algorithms that if there isn't a proof of security it very likely is not secure, with disastrous effects. It is not a kindness to let this sort of wrong information spread, and it's not assholish to be clear that it is wrong.

EDIT: Even if you want to add "probably" or some other qualifying word to your claims, "This shuffling algorithm is probably secure" is still wrong. No, that shuffling algorithm is probably not secure. I am sorry to be a hardass here, but the idea that this is an effective approach to security needs to be deleted from your mind.

pmarreck•5mo ago
And if you believe nothing ChatGPT says then you simply suffer from the opposite bias. The dilemma is that it is mostly right, most of the time, isn't it? Almost like... people? I mean, one person responded with an error that ChatGPT almost certainly wouldn't have made: https://news.ycombinator.com/item?id=45158430

If ChatGPT says things that are empirically provable, then it's not wrong, is it?

tripletao•5mo ago
"Mostly right" is the most insidious option, because the frequent true statements or implications will lull you into believing the infrequent false ones.

For Fisher-Yates, if the RNG is good then the output permutation is independent of the input permutation. That's not true for your proposed algorithm, and neither you nor the LLM has analyzed how quickly the sensitivity to the input permutation decays. That sensitivity will have complicated structure, so it's a nontrivial problem in cryptanalysis.

The problem where the low bits of old, not cryptographically secure PRNGs were particularly non-random is real. That would be a problem for any algorithm using it though--imagine if your riffle took rand() % 2. The resolution (use the high bits, not the low bits) is the same for any algorithm. It's possible that your proposed algorithm is less sensitive because the much greater number of RNG calls makes it somehow average out, but neither you nor the LLM has analyzed that, because the residual structure is again a cryptanalysis problem.

Any potentially adversarial application should use a CSPRNG, moving the cryptographic security into that existing, well-studied block. So the problem you're trying to solve doesn't exist in practice. The LLM can probably be prompted to tell you that; but since you stated that the new shuffling algorithm was your idea, it worked to find reasons it was a good one. I don't think this is a good use of LLMs, but if you're going to then you have to avoid any suggestion that the new idea is yours, to avoid triggering the sycophancy.

kerkeslager•4mo ago
> And if you believe nothing ChatGPT says then you simply suffer from the opposite bias.

No, you're making a logical error. It's not that I believe the opposite of what ChatGPT says, it's that I don't believe what ChatGPT says based on ChatGPT saying it alone. If ChatGPT says your algorithm is right, it's not that I automatically believe your algorithm is wrong, it's that I believe I still don't know it's right because ChatGPT is shit at being right.

> The dilemma is that it is mostly right, most of the time, isn't it? Almost like... people?

I mean, yeah. Which is why I don't believe things just because some random person says them, either.

I tend to assign confidence intervals to beliefs in my head, and even an expert saying something doesn't make me super confident in it. I tend to believe things strongly if I can understand the hard evidence for them, not because any person, even an expert, says so.

With security, we have tens of thousands of examples of unproven security assertions being proven wrong later with disastrous effect, so the standard of proof is very high, and ChatGPT saying it is absolutely nowhere near close enough to proof. The only acceptable proof with a security algorithm is mathematical proof.

> If ChatGPT says things that are empirically provable, then it's not wrong, is it?

ChatGPT also says things that are empirically disprovable, so basically you always have to go to empirical proof to tell whether what ChatGPT has said is true, making ChatGPT next-to-useless if accuracy is at all a concern. Just go to empirical proof and skip the pointless ChatGPT step.

And accuracy really, really is a very big concern in secure random shuffling algorithms, so you really really need to empirically prove that they work which you really really have not done because you stopped thinking when you trusted ChatGPT. Your shuffling algorithm is not proven, and very likely is not secure.

oneshtein•5mo ago
> Turns out that an even easier way is just to assign 128 bits of randomness

52! is roughly 2^226.

You cannot address all 2^226 positions with a 2^128 address generated from 2^64 seed.

rcxdude•5mo ago
it's 128 bits per card. That's vastly more than 226 bits.
amluto•5mo ago
You could use something like 12 bits of randomness per card (a very rough approximation of the log_2(n^2)) to get the probability that you reuse a number down to a manageable level, check if you’ve reused a number (which is basically free once you’ve sorted), and then repeat the whole process if you reused a number.

Or you could have a lazily calculated infinite precision random number per card and use more like 6 bits per card on expectation. Other than using more (and annoyingly variable) memory, this may well be faster than a properly unbiased Fisher-Yates.

Or you could assign a few bits per card, sort, and then recurse on each group of cards that sorted into the same bucket.

In summary, there are lots of genuinely unbiased solutions (assuming a perfect binary RNG), and they all boil down to something roughly equivalent to rejection sampling.

pmarreck•5mo ago
With all due respect, I don't think you understand. The 128 bits are just to sort the 52 cards into a random-enough order, and they have sufficient entropy and space for that.

We're not interested in sorting all possible decks, just all possible cards (52) into 1 unique randomized deck. The more bits we assign to these, the more any error or bias in the source goes to nil. (In contrast, consider the use of only 4 bits.)

Since everyone's bashing my use of ChatGPT, I don't think ChatGPT5 would have made that categorical error.

oneshtein•5mo ago
128 bits x 52 cards is 6656 bits, about 30x more than the ~226 bits needed, so it will work IF those bits are independent. If a CSPRNG such as AES-128 is used, its 128-bit seed is not enough to cover all 52! ≈ 2^226 permutations. And if the PRNG sequence is reused to generate more shuffles, then the next shuffle can be predicted from the previous one.

A CSPRNG like AES-256 with a 256-bit random seed is safe.

A CSPRNG like AES-128 with a 128-bit (or smaller) random seed is not safe.

tripletao•5mo ago
> If PRNG sequence is reused to generate more shuffles, then the next shuffle can be predicted from the previous one.

That's theoretically true by a counting argument, but for a CSPRNG the only way to compute that next shuffle costs about as much computationally as brute force. 128 bits is enough to make that infeasible, though 256 may be preferable for the usual abundance of caution.

The number of permutations doesn't matter here, since an attack becomes computationally infeasible before it becomes counting infeasible. It's also possible for a practical attack to exist even when the output can't be predicted exactly, since statistical information is still valuable.

kerkeslager•4mo ago
> Turns out that an even easier way is just to assign 128 bits of randomness to each card and sort them by that. Degrades gracefully from biased sources, apparently.

This was edited in after my original response, so I didn't respond to it previously.

I won't generally do the work to find the attacks your bad security claims are susceptible to, but this one is trivial.

This does not degrade gracefully in any way. If your biased source of randomness generates the same numbers more than once, the items in the list are assigned the same number, so that sorting creates a bias toward the pre-shuffle ordering. This sort of collision is more common than one would intuit due to the birthday paradox[1] which is why this class of attacks is known as birthday attacks. In short, the birthday paradox amplifies problems of biased randomness sources rather than degrading gracefully.

In fact, this bias toward the pre-shuffle ordering still exists even with a perfect RNG, it's just going to be extremely unusual for a collision to occur in the space of 128 bits. But Fisher-Yates doesn't have this problem; why use an algorithm that we can easily prove is biased, even if the bias is negligible with proper PRNG?

Also, I really have to question your claim that this is even easier than Fisher-Yates. Fisher-Yates is not particularly complicated. Did you even bother to learn how Fisher-Yates works before assuming you could come up with a simpler solution than the industry standard?

[1] https://en.wikipedia.org/wiki/Birthday_problem

EDIT: Another problem with all the algorithms you've proposed is that they can't be partially applied. A hand of 6max Texas Hold'em, the most common online poker variant, requires only 2x6 + 5 = 17 cards, and by partially applying Fisher-Yates you can select just the first 17 cards in the shuffled deck without having to know or generate the other 35. This will be faster and deplete your RNG slower, further decreasing the chance of bias.
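A sketch of that partial application, assuming the rand crate: run the swap loop forward and stop once the needed cards are fixed, so the rest of the deck is never generated:

    use rand::Rng;

    /// Partial Fisher-Yates: only the first `needed` positions are decided,
    /// which is all a 6-max hold'em hand requires (2*6 + 5 = 17 cards).
    /// The prefix is distributed exactly as in a full shuffle.
    fn deal_prefix<T>(deck: &mut [T], needed: usize, rng: &mut impl Rng) {
        for i in 0..needed.min(deck.len()) {
            let j = rng.gen_range(i..deck.len());
            deck.swap(i, j);
        }
    }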

amluto•5mo ago
You can:

(a) Use Fischer-Yates, which is fast and, if correctly implemented, unbiased, or

(b) Come up with an ad-hoc scheme like this, do some fancy math to bound its total variation distance (or, worse, some weaker concept of distance), convince yourself that the distance to the uniform distribution is acceptable, and end up with a slower algorithm when you’re done.

wzdd•5mo ago
The article links to the paper which discusses the issue. The paper identifies two problems: a biased swapping algorithm, and a PRNG with easy-to-guess state.

For the first problem, a very simple "for each card, pick a number between 1 and 52 and swap the card with that number (or do nothing if it's the same card)" is proposed to eliminate bias. This seems pretty simple to verify.

For the second problem, there is no workaround except to use a better RNG.

So in the context of the paper, the reason your algorithm wouldn't have worked is because the programmers seeded their PRNG with the time of day. In the context of shuffling more generally, I don't know but it seems that there are even simpler algorithms.

That paper: https://web.archive.org/web/20140104095330/http:/www.cigital...

runeblaze•5mo ago
> Today many online poker sites use the Fisher–Yates algorithm, also called the Knuth shuffle (which sounds delightfully like a dance). It’s easy to implement and delivers satisfactory results.

Assuming CSPRNG and fisher yates, why is it only "satisfactory"...? What's better...?

eterm•5mo ago
Satisfactory in this context is good, not bad.

We live in a euphemistic world where "satisfactory" is handed to failures so as not to hurt their feelings, but the word also and originally means it's good enough, i.e. it delivers an unbiased shuffle.

pfg_•5mo ago
But it's not just good enough, it's optimal. It is equivalent to picking a random deck from the set of all possible decks assuming your random source is good. More random than a real shuffle.
VanTheBrand•5mo ago
Right and that’s what satisfactory means, the condition was satisfied.
insaider•5mo ago
Lame clickbait title. TL;DR: shuffling algorithms have improved since the '90s...
defrost•5mo ago
Really?

  The original Fisher–Yates shuffle was published in 1938 by Ronald Fisher and Frank Yates in their book Statistical tables for biological, agricultural and medical research.

   The modern version of the Fisher–Yates shuffle, designed for computer use, was introduced by Richard Durstenfeld in 1964. It is also known as the Knuth shuffle after Donald Knuth who popularized it in Volume Two of The Art of Computer Programming (1969)
~ https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

Picking a bad algorithm, as they did here in this example from the 1990s, is timeless.

Good shuffle algorithms existed then, and had existed since the 1930s.

rekttrader•5mo ago
When I did it back in the day for a poker site, we also had the constraint of never storing the positions of shuffled cards, for security reasons. We always stored the deck sorted, and when it came to dealing I developed a shuffle server that got noise from a taped-up webcam's CMOS sensor. When a player needed a card, it polled the server, which was continuously outputting random positions from 0 to 51; if that position was already dealt, it re-requested a new one. Extra entropy came from player decision time and network speed. It was a fun thing to build.
monero-xmr•5mo ago
Online poker died because having 1 other person at the table sharing cards surreptitiously drastically increases the odds in your favor, which ruins cash games. Let alone AI which has mostly solved the game if you have a bot making your decisions.

Poker must be played in person, otherwise it’s a dead game

bzhang255•5mo ago
None of this is wrong, but anecdotally, I will say that there are still human beings playing poker online, and a better human being can still win in the long run. (Though, live poker is much more fun)
kittikitti•5mo ago
I've run TRNGs to create truly random numbers that can be audited and certified by the NSF. Hardware like HSMs and TPMs were convenient sources. I just have to wait until a Big Tech gatekeeper propagates this method. Who knows? Maybe even the temperature parameter in language models could find some value in it.
procaryote•5mo ago
A lot of problems are pretty hard if you insist on doing zero research