Use Bayes rule to mechanically solve probability riddles

https://cloud.disroot.org/s/Ec4xTMFDteTrFio

75•zaik•5d ago

Comments

kgwgk•2d ago

> You're told that at least one of them is a girl.

> Likelihood of at least one girl

What the “mechanism” requires is “likelihood of being told that at least one of them is a girl”.

Use Bayes rule to correctly solve probability riddles:

https://news.ycombinator.com/item?id=45056790

    p(both are girls | you're told at least one is a girl)
       = p(you're told at least one is a girl | both are girls) * p(both are girls) / (
            p(you're told at least one is a girl | both are girls) * p(both are girls)
            +
            p(you're told at least one is a girl | they aren't both girls) * p(they aren't both girls)
        )

The solution there assumes that p(you're told at least one is a girl | both are girls) = p(you're told at least one is a girl | they aren't both girls).
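The formula above can be evaluated directly. What follows is only a minimal sketch, assuming a prior of 1/4 for two girls (independent, equally likely sexes); the function name `posterior_both_girls` and the example likelihood values are illustrative assumptions, not something taken from the article.

```python
from fractions import Fraction

def posterior_both_girls(told_given_gg, told_given_not_gg, prior_gg=Fraction(1, 4)):
    """Bayes' rule for p(both girls | told 'at least one is a girl')."""
    numerator = told_given_gg * prior_gg
    evidence = numerator + told_given_not_gg * (1 - prior_gg)
    return numerator / evidence

# Assumed scenario 1: you asked and got a truthful answer. 'told' then has
# probability 1 for gg and 2/3 for the rest (true for bg and gb, false for bb).
asked = posterior_both_girls(Fraction(1), Fraction(2, 3))

# Assumed scenario 2: equal likelihoods. Being told carries no information,
# so the posterior equals the prior.
uninformative = posterior_both_girls(Fraction(1, 2), Fraction(1, 2))

print(asked, uninformative)  # 1/3 1/4
```

The answer depends entirely on the two likelihood arguments, which is exactly the part the riddle leaves unstated.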

adammarples•2d ago

Assuming a quiz setter who always tells the truth, doesn't that come down to population-level statistics of male vs. female?

kgwgk•2d ago

This is not about someone choosing to lie, it’s about someone choosing what true thing to say.

If “you're told that at least one of them is a girl” was intended to mean “you ask whether at least one of them is a girl and you’re told that at least one of them is a girl”, it would have been pretty easy to make that clear.

In that case it’s straightforward that p(you're told at least one is a girl | both are girls) = p(you're told at least one is a girl | only one is a girl) = 1 (with just the mild and reasonable assumption that you’re not being lied to.)

Edit: also the reasonable assumption that your question is unrelated to the number of girls - it’s easy to imagine settings where you asked because of something which may be correlated with the number of girls.

AnotherGoodName•2d ago

Pretty much all of them are like this and it's actually a terrible article. For those wondering how the likelihood of them being a girl can change, consider this:

>A family has two children. You're told that at least one of them is a girl. What's the probability both are girls?

"The question writer took all sets of two child families and ruled out the bb case. Then they asked the exact question above." This is a 1/3 chance: select gg from [gg, bg, gb].

"The question writer came across a girl from a two child family, then they asked the exact question above." This is a 1/2 chance: select gg from [gg, gg, bg, gb], with gg listed twice since there are two ways to select a girl from that family; i.e. coming across a girl is twice as likely to occur in the gg case as it is in either the gb or the bg case.

I think that's the clearest wording to get the message across. Either way it's the exact same question but it reasonably has a completely different answer. There's no way to resolve this ambiguity with the question as written. Pretty much all questions here are like this.
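The two readings above can be checked by exact enumeration. This is a sketch under stated assumptions: each of bb, bg, gb, gg has probability 1/4, and in the second reading the child you "come across" is either of the two with probability 1/2. The function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product

# (older, younger): bb, bg, gb, gg, each assumed to have probability 1/4.
families = list(product("bg", repeat=2))

def p_gg_filter_families():
    """Reading 1: keep every family with at least one girl, then ask if it is gg."""
    kept = [f for f in families if "g" in f]
    return Fraction(sum(f == ("g", "g") for f in kept), len(kept))

def p_gg_meet_a_girl():
    """Reading 2: meet one child of a random family; keep the case if it is a girl."""
    weight_gg = Fraction(0)
    weight_any = Fraction(0)
    for family in families:
        for child in family:  # each child is met with probability 1/2
            if child == "g":
                w = Fraction(1, 4) * Fraction(1, 2)
                weight_any += w
                if family == ("g", "g"):
                    weight_gg += w
    return weight_gg / weight_any

print(p_gg_filter_families(), p_gg_meet_a_girl())  # 1/3 1/2
```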

Amusingly the Monty Hall problem in its original form was written to avoid this ambiguity. There's a comment thread above on this. But you know what the author of the article linked here did? They reworded it and added ambiguity similar to the above. The Monty Hall problem needs to specifically state "the host opens a door that he knows does not contain the prize" or it's unanswerable. The author of the article removed this statement without awareness of its importance!

This sounds mean but honestly I really really think this article was written by someone with no education or knowledge on statistics. As in they broke perfectly reasonable questions and added ambiguity to the point they are not answerable as written in an article trying to demonstrate how easy stats is.

zaik•2d ago

I did not mention a lot of assumptions. For example my problem statement also does not answer the question if Monty always gives you a choice to switch or only when you picked in a certain way. There are quite a few possible variations, all of which change the answer: https://en.wikipedia.org/wiki/Monty_Hall_problem#Other_host_...

In the derivation of the likelihood the assumptions are clearly stated.

While my knowledge on statistics is certainly unsatisfactory, your assessment of the education I had is quite wrong. My final exam in order to obtain a Master's degree in mathematics is next week. One of the subjects I chose is Bayesian Statistics, so wish me luck.

AnotherGoodName•2d ago

I do wish you luck. Fwiw I'd always work forward from the question and talk to the ambiguities present especially when people may come across this and not realize. Stating the Bayesian probabilities of one possible interpretation and having that as your way of stating assumptions is a bit like working backwards from one possible answer.

As an example of how presenting this way can hurt: the link you had for the sisters paradox doesn't talk to the ambiguities of a question that is well known to have no single answer - https://en.wikipedia.org/wiki/Boy_or_girl_paradox . This information spreads as people see it and restate it. This leads to a false belief that the sisters paradox has no ambiguities and that the answer is clearly 1/3 (the Wikipedia page on this has the correct statement that it's ambiguous and the answer can be 1/3 or 1/2 depending on one of two reasonable interpretations).

I think a far better article would talk to the ambiguities of each of these, not subtly state assumptions. I'll also point out that these types of ambiguous questions are commonly used in DS interviews (I've worked in big tech for many years now). The expected response of a strong candidate is a discussion on the ambiguities. You'll usually be prompted "is there any other interpretation of this question that leads to a different result?". If people read this as their guide rather than the more detailed Wikipedia articles etc. they may be misled, which is why I'm strongly negative on this article as written. I'd hope no one reads this without realizing the assumptions here, since they are critical.

rtrgrd•2d ago

Might be hug of death but the load times are horrifically slow.

fuomag9•2d ago

For me Nextcloud has always been worryingly slow, even on my own instance

FabHK•2d ago

Is the markdown rendered once on the server and stored as HTML? Then why is it slow? Or is it rendered for each client, or rendered in the client?

Shorel•2d ago

This Monty Hall problem was posed to Marilyn vos Savant, a woman with an extremely high IQ, who solved it correctly, and many readers of her column, including PhDs and mathematicians, declared her solution wrong.

Then careful analysis proved her correct.

https://en.wikipedia.org/wiki/Monty_Hall_problem#Savant_and_...

FabHK•2d ago

To be fair, I think this problem needs to be formulated very carefully and specifically [0], because the "correct" answer is predicated on that, and many formulations one finds (including the one in her column) are not that.

[0] It has to be specified that a) Monty knows what's behind which door, and b) he will on purpose always open a door such that there's a goat behind it.

kgwgk•2d ago

> many formulations one finds (including the one in her column)

As far as I’ve been able to find the text below would be the original question (“the host, who knows what’s behind the doors, opens”) and answer (“the host, who knows what’s behind the doors and will always avoid the one with the prize, opens”).

Are you referring to that?

https://web.archive.org/web/20130121183432/http://marilynvos...

Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He says to you, "Do you want to pick door #2?" Is it to your advantage to switch your choice of doors? [Craig F. Whitaker - Columbia, Maryland]

Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

FabHK•2d ago

Yes, indeed, thanks for digging it up.

You see that Marilyn's answer contains the clarified correct specification ("the host, who knows what’s behind the doors and will always avoid the one with the prize..."), but the question does not fully specify it.

Thus, ignoring the clarification in the answer (which furthermore pertains to a modified problem), one could interpret the question differently: the host randomly opens a door [0], this specific day it happens to contain a goat, should you switch?

And that ambiguity has given rise to so much spilled ink (and, by the way, the misconception that statistics professors don't understand probability theory).

[0] Note that specifying that the host knows what's behind what door doesn't help. They could still pick a door randomly.
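The difference between the two host behaviours can be made concrete with an exact enumeration. This is a sketch, not a definitive model: it assumes the car is placed uniformly, the player picks door 0, and a host who has a free choice picks uniformly. The name `p_switch_wins` and its flag are hypothetical.

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def p_switch_wins(host_avoids_car):
    """Exact P(switching wins | host revealed a goat); the player always picks door 0."""
    pick = 0
    win = Fraction(0)
    goat_shown = Fraction(0)
    for car in DOORS:  # car placed uniformly at random (assumption)
        others = [d for d in DOORS if d != pick]
        if host_avoids_car:
            options = [d for d in others if d != car]
        else:
            options = others  # host opens one of the other two doors blindly
        for opened in options:
            w = Fraction(1, 3) * Fraction(1, len(options))
            if opened == car:
                continue  # car revealed: not the situation described in the riddle
            goat_shown += w
            switch_to = next(d for d in DOORS if d not in (pick, opened))
            if switch_to == car:
                win += w
    return win / goat_shown

print(p_switch_wins(True), p_switch_wins(False))  # 2/3 1/2
```

With a host who always avoids the car, switching wins 2/3 of the time; with a host who opens a door at random and merely happens to show a goat, it wins 1/2.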

kgwgk•2d ago

Did any of those readers of her column, including PhDs and mathematicians, who declared her solution wrong do so because they objected to the clarification? (Does it make sense to declare the answer wrong while ignoring the answer?)

FabHK•2d ago

I think they didn't notice the significance of that clarification (which, again, was given in the context of a modified problem). Reading the exchanges again, though, I see that many respondents were not only mistaken, but also impolite or sexist in accusing Marilyn of being wrong.

If you specify the problem carefully, anyone with some training in probability should get the answer right. But clearly back then it was a head scratcher.

I wonder whether there've been other problems like that, or we will encounter similar ones: elementary and easy to understand, yet many people get it confidently wrong, until the correct solution permeates through culture.

kgwgk•2d ago

> [that clarification] was given in the context of a modified problem

I don’t understand what you mean by that. The clarification was part of the original answer in the context of the original question. The clarification was stated again and again and again. If that’s a “modified problem” so be it, but that was the problem being discussed with her readers.

FabHK•1d ago

The modification I'm referring to is the one Marilyn suggests with a million doors, with Monty opening 999,998 of them.

kgwgk•1d ago

Ok, so the modification was the "Here’s a good way to visualize what happened." explanation.

I would say that the whole (short) paragraph is intended to clarify why the answer to the question is "Yes; you should switch."

Maybe it was not a good way to visualize what happened after all. Experience shows that some readers were unable to understand the argument.

However, none of them complained about the explanation being wrong because those were two completely different problems (only in the second one the host will always avoid the door with the prize, apparently).

SAI_Peregrinus•2d ago

The original problem is the one you're thinking of as modified, since the original problem was about a real-world game show, and the rules of the real-world show in question included Monty Hall (the host) opening a door that the player hadn't picked and which did not contain the car. The problem assumed familiarity with the game show; the clarification is for people who didn't watch the show.

FabHK•1d ago

The modification I'm referring to is the one Marilyn suggests with a million doors, with Monty opening 999,998 of them.

fenomas•2d ago

I don't think this complaint stands at all. The original question is stated colloquially, not formally, and vos Savant's interpretation of it is the one that's consistent with the question's language and with how game shows work. Even if other interpretations are possible, hers is the one a normal reader would make, and the one the letter-writer apparently intends.

> And that ambiguity has given rise to so much spilled ink (and, by the way, the misconception that statistics professors don't understand probability theory).

I don't think this stands either - the letters quoted don't say anything about the distinction you're making. Unless they're all fictional or selectively edited, a bunch of PhDs really did get the puzzle wrong.

bmm6o•2d ago

There's also the cultural context of Monty Hall being a real person who had a real game show, on which he really opened doors with goats behind them. Most readers of her column would have been familiar with the mechanics of the show. And the question doesn't really make sense if there's a chance that he opens the door with a prize, since there's no more hidden information in that case.

michaelcampbell•2d ago

> Suppose there are a million doors,...

This is a common way to explain it, but I always found it less intuitive; not sure why.

My go to thought experiment along these lines is rather this:

Adjust the game to where both you and your friend are playing at the same time. You _NEVER_ switch doors, and your friend _ALWAYS_ does, when given the choice by Monty.

Since you never switch, no matter what Monty does you know you're going to win 1/3 of the time. Since you both know Monty is always going to show a "goat" door, ONE of you must win, so your friend MUST WIN ALL THE TIMES YOU DO NOT. Since you win 1/3 of the time, the wins "left" is 2/3 of the time.
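The two-player argument above can be sketched by enumerating every (car, first pick) combination. The code assumes Monty always opens a goat door other than the pick; when he has two such doors it takes the lower-numbered one, an arbitrary choice that does not affect the win counts. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

def win_rates():
    """Return (never-switch win rate, always-switch win rate) over all equally likely cases."""
    stay_wins = 0
    switch_wins = 0
    cases = list(product(range(3), repeat=2))  # (car door, first pick)
    for car, pick in cases:
        # Monty opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        other = next(d for d in range(3) if d != pick and d != opened)
        # Exactly one of the two players wins in every case.
        assert (pick == car) != (other == car)
        stay_wins += (pick == car)      # you never switch
        switch_wins += (other == car)   # your friend always switches
    n = len(cases)
    return Fraction(stay_wins, n), Fraction(switch_wins, n)

print(win_rates())  # (Fraction(1, 3), Fraction(2, 3))
```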

taid9iK-•1d ago

Oh I like that one. I am having real trouble with the commonly used explanations.

griffzhowl•2d ago

> [0] It has to be specified that a) Monty knows what's behind which door, and b) he will on purpose always open a door such that there's a goat behind it.

You don't need condition (a) here. It's enough to just stipulate that a door with a goat behind it will be opened, however that comes about

FabHK•1d ago

Sure. It's really b) that's important (which generally requires a). What was specified in the question though was a).

actionfromafar•2d ago

A very smart person is named "savant" and "her grandmother's name was Savant; her grandfather's, vos Savant".

https://en.wikipedia.org/wiki/Marilyn_vos_Savant

sparsely•2d ago

For the last one, why does the "born on a Tuesday" information change the result? I don't see how it isn't equivalent to "born on a day", since the day of the week has no connection to the rest of the scenario. I understand why "at least one boy" does matter.

zaik•2d ago

If you accept the Bayes theorem, the answer is that the likelihood of "At least one boy is born on a Tuesday" is not the same for different numbers of boys. The more boys the more likely the statement is true. Therefore this information is indicative of how many boys Mrs. Chance has.

jammaloo•2d ago

It seems to me that it comes down to how the day of the week was picked.

If they picked a random day of the week, and there was only one boy, then there is only a 1/7 chance of a boy being born on that day.

If they have one boy, who was born on a Tuesday, and that is why they picked the day, then there is a 100% chance of a boy being born on that day, so no additional information is conferred.

fenomas•2d ago

Puzzles like that one have always seemed dishonest to me. It only makes sense if you start from the conclusion that it's meant to illustrate Bayes rule, and then work backwards to the assumption that the predicate "boy born on a Tuesday" is supposed to be independent of who's being asked about.

But in plain English, ".. At least one of them is a boy born on Tuesday" suggests the speaker is giving a fact that was chosen because it's true of the person spoken about - like if the kids were both born on Thursday then that would be the day named. And read that way, the Bayes illustration doesn't stand and the "correct" answer makes no sense.

To make it honest, it should really be worded like: "Mrs. Chance has two children of different ages. You ask whether at least one of them is a boy born on Tuesday, and you are told yes. What is the probability that both of them are boys?" Or am I missing something?

kgwgk•2d ago

> If you accept the Bayes theorem

That doesn’t make a lot of sense. A theorem is just a theorem. It’s proved, and in this case the proof is trivial.

The question is whether you accept that the description of the problem in terms of conditional probabilities is adequate, and then whether you accept that the values assigned to those conditional probabilities are appropriate.

amluto•2d ago

I think this particular question illustrates a major oversimplification in the entire premise of the webpage. If you have a probability problem that isn't well-specified, no amount of "mechanical" magic, Bayesian or otherwise, will give you a fully correct answer, since you are missing relevant details.

Let's consider this particular question:

"Mrs. Chance has two children of different ages. At least one of them is a boy born on Tuesday. What is the probability that both of them are boys?" [Source: https://news.ycombinator.com/item?id=45052502]

The question is bizarre and there are plenty of ways to interpret it.

Here's how I guessed it was intended to be interpreted: I'm a person who just met a stranger, and the stranger told me they had two children of different ages (i.e. not twins). I did a tiny bit of investigation and found an undated article stating that they had a baby boy born on an unstated Tuesday in the past. The article gave no indication as to whether any other children had been born yet. I believe, a priori, that each of the stranger's children is either a boy or a girl with 50% probability each, i.i.d. for both children. I believe that there are no further biases in the article (e.g. if the child in question was a second child, then the article would have been equally likely to be published and found by me regardless of the gender of the first child).

The only relevant thing I learn is that the children were not both girls. Then the problem is essentially identical to the sisters problem higher up on the webpage, and there is a 1/3 a posteriori probability that both children are boys.

Now let's interpret the same question differently. I meet a stranger, and the stranger mentions that they have two children, and I determine, a priori, that each child is a boy or a girl, with 50% probability for each child, i.i.d. For some bizarre reason, I decide to ask the stranger "Do you have a son who was born on a Tuesday? Answer yes or no, and do not give any other information!" and, for some bizarre reason, the stranger actually remembers or calculates the answer and answers honestly, and the answer is yes. And the probability that the stranger gives a correct, honest answer is independent of the birth dates and genders of both children, which is a very strange assumption indeed. Now you get the scenario in the webpage: it is dramatically more likely that there was a boy born on a Tuesday if there were two boys than if there were only one.
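Both interpretations can be compared by enumerating all 196 equally likely (sex, weekday) pairs. This is a sketch assuming sexes and weekdays are uniform and independent; the weekday encoding, the `TUESDAY` index and the helper `p_two_boys` are illustrative choices, not part of the original problem.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1  # weekdays encoded 0..6; which index stands for Tuesday is arbitrary
children = list(product("BG", range(7)))      # 14 equally likely (sex, weekday) types
families = list(product(children, repeat=2))  # (older, younger): 196 equally likely pairs

def p_two_boys(condition):
    """P(both boys) among the families that satisfy `condition`."""
    kept = [f for f in families if condition(f)]
    both = [f for f in kept if all(sex == "B" for sex, _ in f)]
    return Fraction(len(both), len(kept))

# First reading: all that is learned is that the children are not both girls.
not_both_girls = p_two_boys(lambda f: any(sex == "B" for sex, _ in f))

# Second reading: a truthful "yes" to "do you have a son born on a Tuesday?".
yes_to_question = p_two_boys(lambda f: ("B", TUESDAY) in f)

print(not_both_girls, yes_to_question)  # 1/3 13/27
```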

The older HN thread that the article links has some fun comments giving even more differing interpretations (e.g. that "born on a Tuesday" refers to the most recent Tuesday, in which case, if the children are not twins, one might reasonably conclude that the younger child is a newborn boy and that absolutely no information is gained about the elder child).

This whole situation illustrates one of my major pet peeves about the way that statistics is often done. The real world is complex, and there are many reasonable experiments that one might do, and there are many reasonable questions one might ask about what was learned from the experiment. Nonetheless, it's very very common to see a conclusion that consists almost exclusively of something to the effect of "X significantly improved Y", and, while this might be mechanically correct in the sense that you could shove the numbers into your favorite statistics software and get that answer, you don't know enough details about the study to translate that result into any useful answer to any clearly stated question about the world.

JeffJor•2d ago

Mr. Bertrand has (exactly - this needs to be included) two children (not twins, which is not quite the same as different ages). A gender, and a day of the week, that apply to at least one of his children have been written inside a sealed envelope. What is the probability that both children have that gender?

In this problem, we have no gender- or day-specific information. So the answer can only be the probability that he has two of the same gender. Which is 1/2.

Now open the envelope. If the answer changes to P based on what you see written, it has to change to the same P regardless of what you see written. Which means you didn't need to unseal the envelope; the answer was P before, not 1/2.

This is what Joseph Bertrand identified as his Box Paradox in 1889. That word was used to describe an actual contradiction, not a non-intuitive result. It disproves any answer except P=1/2. FOR ANY OF THESE PROBLEMS.

In fact, it is the same reason why the Monty Hall Problem's answer is what it is. Many "explanations" will claim that your original probability can't change, but never justify it. This is the justification - if it changes when one door is opened, it must change the same way when either door is opened.
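The envelope argument can be checked numerically under one specific assumption about how the envelope is filled: one of the two children is chosen by a fair coin and that child's sex and weekday are written down. That selection rule is an assumption made for this sketch (the comment does not spell it out), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

children = list(product("BG", range(7)))      # (sex, weekday), assumed uniform
families = list(product(children, repeat=2))  # 196 equally likely families

def p_both_have_written_sex(written):
    """written = (sex, weekday) copied from one of the two children, chosen by a fair coin."""
    match = Fraction(0)
    total = Fraction(0)
    for family in families:
        for child in family:  # each child is the one written down with probability 1/2
            if child == written:
                w = Fraction(1, len(families)) * Fraction(1, 2)
                total += w
                if all(sex == written[0] for sex, _ in family):
                    match += w
    return match / total

# Whatever ends up written in the envelope, the answer is the same.
answers = {p_both_have_written_sex(w) for w in children}
print(answers)  # {Fraction(1, 2)}
```

Under this selection rule opening the envelope changes nothing, which is the consistency the comment appeals to; a different rule for choosing what gets written gives different numbers.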

knappa•2d ago

You could definitely replace "Tuesday" with something like that, and part of the pedagogical purpose of the problem is for people to question this. The actual effect comes from not distinguishing the boys. That increases the likelihood that at least one of them will be born on any particular day, upweighting the likelihood that there are larger numbers of boys, i.e. you just get, on average, better coverage of boys-born-on-Tuesday when there are more boys.

FabHK•2d ago

Well, "born on a day" would not convey any information unless it means "during daytime". If that has probability 1/2, the answer would be 3/7. With Tuesday (or, indeed, any other weekday, with probability 1/7), it is 13/27.
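Both numbers follow from one formula in the trait probability p. The sketch below assumes the trait is fixed in advance (as in "you ask and are told yes"), is independent of sex, and is independent between the two children; `p_two_boys` is a hypothetical helper name.

```python
from fractions import Fraction

def p_two_boys(p):
    """P(two boys | at least one boy has a trait of probability p)."""
    like_one_boy = p                  # the single boy has the trait
    like_two_boys = 1 - (1 - p) ** 2  # at least one of the two boys has it
    # Priors: 1/2 for exactly one boy, 1/4 for two boys; with no boys the likelihood is 0.
    joint_one = Fraction(1, 2) * like_one_boy
    joint_two = Fraction(1, 4) * like_two_boys
    return joint_two / (joint_one + joint_two)

daytime = p_two_boys(Fraction(1, 2))  # "born during daytime"
tuesday = p_two_boys(Fraction(1, 7))  # "born on a Tuesday"
certain = p_two_boys(Fraction(1))     # a trait every boy has
print(daytime, tuesday, certain)      # 3/7 13/27 1/3
```

Simplifying the expression gives (2 - p) / (4 - p), which moves from 1/3 toward 1/2 as the trait gets rarer.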

MontyCarloHall•2d ago

Recent related discussion: https://news.ycombinator.com/item?id=45051798

qwertytyyuu•2d ago

I feel this is the case because pretty much all probability riddles are Bayes' rule problems, which are famously unintuitive

0xfaded•2d ago

Just a tidbit for remembering Bayes' rule:

P(A|B)P(B) = P(A,B) = P(B|A)P(A)

The familiar forms

P(A|B) = P(A,B)/P(B) = P(B|A)P(A)/P(B)

immediately follow
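As a quick numeric check of the identity, here is a sketch with a made-up joint distribution over two binary events; the table values are hypothetical and chosen only so that the fractions come out simple.

```python
from fractions import Fraction

# Hypothetical joint distribution P(A, B), keyed by (A happened, B happened).
joint = {
    (True, True): Fraction(1, 8),
    (True, False): Fraction(3, 8),
    (False, True): Fraction(1, 4),
    (False, False): Fraction(1, 4),
}

p_a = sum(p for (a, _), p in joint.items() if a)  # marginal P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # marginal P(B)
p_ab = joint[(True, True)]                        # joint P(A, B)

p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a

# P(A|B)P(B) = P(A,B) = P(B|A)P(A), and therefore P(A|B) = P(B|A)P(A)/P(B).
assert p_a_given_b * p_b == p_ab == p_b_given_a * p_a
assert p_a_given_b == p_b_given_a * p_a / p_b
print(p_a_given_b, p_b_given_a)  # 1/3 1/4
```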

glitchc•2d ago

Comma is not a defined operator. Do you mean intersection?

0xfaded•2d ago

It's a joint distribution :)

https://en.m.wikipedia.org/wiki/Joint_probability_distributi...

bobbylarrybobby•2d ago

In words, for those who need help with the first line: the chance that A and B both happen is the chance that A happens, and (given that A happened) B happens. (Or vice versa.)
