The Fisher-Yates shuffle is backward

https://possiblywrong.wordpress.com/2020/12/10/the-fisher-yates-shuffle-is-backward/

65•possiblywrong•1mo ago

Comments

robinhouston•1mo ago

That’s funny. I’ve always done it the forwards way. I didn’t even realise that wasn’t the usual way.

I suppose one of the benefits of having a poor memory is that one sometimes improves things in the course of rederiving them from an imperfect recollection.

furyofantares•1mo ago

Same, I've implemented it a number of times and always done it forward, and can't recall ever seeing it backwards. I've looked at the wikipedia page for it more than once too, which, as the article mentions, shows it backwards.

Maybe it's because it's so easy to prove to yourself that Fisher-Yates generates every possible combination with the same probability[1], and so forwards or backwards just doesn't register as relevant.

[1]This of course makes the a hefty assumption about the source of random numbers which is not true in the vast majority of cases where the algorithm is put into practice as PRNGs are typically what's used. For example if you use a PRNG with a 64 bit seed then you cannot possibly reach the vast majority of states for a 52 card deck; you need 226 bits of state for that to even be possible. And of course even if you are shuffling with fewer combinations than the PRNG state can represent, you will always have some (extremely slight) bias if the state does not express an integer multiple of the number of permutations of your array size.

furyofantares•1mo ago

On further inspection the one I'm used to is forward-mirrored, which is exactly the same as backward but the opposite direction.

shiandow•1mo ago

Huh I didn't know the backwards version was more common, it seems odd.

You could also call the last version the online version, as it will ensure the partial list is random at any point in time (and can be used for inputs with indeterminate length, or to extend a random list with new elements, sample k elements etc.)

Not too sure if the enumerate is necessary. I usually dislike using it just to have an index to play around with. A similar way of doing the same thing is:

    for x in source:
        a.append(x)
        i = random.randint(0, len(a))
        a[i], a[-1] = a[-1], a[i]

Which makes the intention a bit clearer. You could even avoid the swap entirely but you would need to handle the case where i is at the end of the list separately.

dooglius•1mo ago

> sample k elements

Not quite sure what you have in mind here, but you need reservoir sampling for this in order to make the selection uniformly random (which I assume is what's desired)

shiandow•1mo ago

You can just use this algorithm but ignore everything after the first k elements. The algorithm still works if you don't store anything beyond the first k elements but just pretend they are there.

lupire•1mo ago

enumerate() is just an awkward way to get len(a). In theory, you could somehow be in an environment where you have dynamically resizing arrays (vectors) that don't track their length internally. But in this case it's probably because OP doesn't have a firm grasp what's happening (which is why they wrote the blog post).

amluto•1mo ago

I find the backward version slightly more intuitive. Here’s why:

Suppose I want to uniformly randomly shuffle a deck of cards in a single pass. I stick the deck on the table and call it the non-shuffled pile. My goal is to move the deck, one card at a time, into the shuffled pile. First I need to select a card, uniformly at random, to be the bottom card of the new pile, and I move it over. Then I select another card, uniformly at random from the still non-shuffled cards, and put it on top of the bottom shuffled card. I repeat this until I’ve moved all the cards, so that each card in the shuffled pile is a uniform random selection from all of the cards it could have been. And that’s it.

One can think of this as random selection, whereas the “forward” version is like random insertion of a not-random card into a shuffled pile. And for whatever reason I tend to think of the selection version first.

dunham•1mo ago

For what it's worth, I think of the forward version as randomly selecting a card and putting it at the bottom of the new pile (0..i in the array).

pstuart•1mo ago

That's quite clean, thanks!

fn-mote•1mo ago

Very interesting article!

For me, the reason for reaching for the “backwards” version first is that it wasn’t as clear to me that the “forward” version makes a uniform distribution.

Even after reading the article.

I appreciated the comment here about inserting a card at a random location, but that also isn’t quite right, because you swap cards not insert. Nevertheless, that did it for me.

NooneAtAll3•1mo ago

one algo "creates" shuffled subarray and grows it - the other "chooses" random cards one at a time

both seem intuitive from individual perspectives

fanf2•1mo ago

There are actually four variants:

• loop counts downwards vs upwards

• the processed part of the array is a uniform sample of the whole array, or it is a segment that has been uniformly shuffled

Knuth described only the downwards sampling version, which is probably why it’s the most common.

The variants are compared quite well on wikipedia https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

possiblywrong•1mo ago

It's interesting that the two forward versions of the algorithm were added to Wikipedia just a few months ago. (The OP article is from 2020.)

ogogmad•1mo ago

`forward_shuffle` is the (GROUP) INVERSE of `fisher_yates_shuffle`.

`mirror_shuffle` is the (GROUP) CONJUGATE of `fisher_yates_shuffle` by the cyclic permutation (n-1,n-2,...,1,0). In group theory, CONJUGATEs are like changes of coordinates. In the present application, they reverse the index labels.

The article said it, but it's worth distilling it.

PS: Oh, here's another link. Denote by "S!", or less formally, the factorial of a set S, to mean the set of permutations of S. Fisher-Yates is equivalent to a bijection between (S+1)! and S! × (S+1).

why-o-why•1mo ago

I'd prefer to see the NIST SP 800-22 results.

(that was a joke)

bluecalm•1mo ago

I guess the reason is that the most interesting part of the problem is implementing non-biased selection from 0 to n and once you have done it you just want to use the number so it's natural to choose from the beginning of the array and swap to the last position beyond that.

flobosg•1mo ago

(2020)

calmbonsai•1mo ago

Huh, I think because of my mental model of a physical card deck, I've always done it the "forwards" way, but had never characterized it as such.

dminor•1mo ago

A lot of backwards looping is a remnant of efficient loops in programming days of yore - you compare your iterator to 0 each time, which is slightly more efficient than comparing to another variable.

Tiny C Compiler

You Are Here

SectorC: A C Compiler in 512 bytes

The F Word

Brookhaven Lab's RHIC concludes 25-year run with final collisions

Speed up responses with fast mode

LLMs as the new high level language

Software factories and the agentic moment

Hoot: Scheme on WebAssembly

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

GitBlack: Tracing America's Foundation

Stories from 25 Years of Software Development

FDA intends to take action against non-FDA-approved GLP-1 drugs

Show HN: A luma dependent chroma compression algorithm (image compression)

Al Lowe on model trains, funny deaths and working with Disney

First Proof

Vocal Guide – belt sing without killing yourself

I write games in C (yes, C) (2016)

Italy Railways Sabotaged

Start all of your commands with a comma (2009)

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

Reinforcement Learning from Human Feedback

Selection rather than prediction

The AI boom is causing shortages everywhere else

72M Points of Interest

A Fresh Look at IBM 3270 Information Display System

Coding agents have replaced every framework I used

Unseen Footage of Atari Battlezone Arcade Cabinet Production

France's homegrown open source online office suite

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev