
Al Lowe on model trains, funny deaths and working with Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
38•thelok•2h ago•3 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
101•AlexeyBrin•6h ago•18 comments

First Proof

https://arxiv.org/abs/2602.05192
51•samasblack•3h ago•37 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
789•klaussilveira•20h ago•242 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
38•vinhnx•3h ago•5 comments

Reinforcement Learning from Human Feedback

https://rlhfbook.com/
62•onurkanbkrc•5h ago•5 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
462•theblazehen•2d ago•165 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
1040•xnx•1d ago•587 comments

France's homegrown open source online office suite

https://github.com/suitenumerique
506•nar001•4h ago•234 comments

Software factories and the agentic moment

https://factory.strongdm.ai/
48•mellosouls•3h ago•49 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
183•jesperordrup•10h ago•65 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
63•1vuio0pswjnm7•7h ago•59 comments

Coding agents have replaced every framework I used

https://blog.alaindichiappari.dev/p/software-engineering-is-back
186•alainrk•5h ago•280 comments

A Fresh Look at IBM 3270 Information Display System

https://www.rs-online.com/designspark/a-fresh-look-at-ibm-3270-information-display-system
27•rbanffy•4d ago•5 comments

What Is Stoicism?

https://stoacentral.com/guides/what-is-stoicism
15•0xmattf•2h ago•7 comments

72M Points of Interest

https://tech.marksblogg.com/overture-places-pois.html
19•marklit•5d ago•0 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
108•videotopia•4d ago•27 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
58•speckx•4d ago•62 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
268•isitcontent•20h ago•34 comments

British drivers over 70 to face eye tests every three years

https://www.bbc.com/news/articles/c205nxy0p31o
169•bookofjoe•2h ago•152 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
197•limoce•4d ago•107 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
281•dmpetrov•21h ago•150 comments

Making geo joins faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
152•matheusalmeida•2d ago•47 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
548•todsacerdoti•1d ago•266 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
422•ostacke•1d ago•110 comments

Ga68, a GNU Algol 68 Compiler

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
37•matt_d•4d ago•13 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
365•vecti•23h ago•167 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
465•lstoll•1d ago•305 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
341•eljojo•23h ago•209 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
66•helloplanets•4d ago•70 comments

A short statistical reasoning test

https://emiruz.com/post/2025-08-17-statistical-reasoning/
56•usgroup•5mo ago

Comments

jldugger•5mo ago
(replying to a now-deleted post)

>> the uncertainty in the number of trials

> Has no meaning to me.

What the author is trying to get at in the admittedly poorly worded question is that the trials are noisy measures of an underlying effect. Your job is to sort by effect size, while accounting for the random chance that a low sample size trial just got unlucky.

You might argue that the question is much harder than the author assumes, since your best guess at the actual effect size seems like it should still just be the success rate, even if the low sample size trials have wider error bars. You'd need to come up with some sort of heuristic that says why 7/9 deserves a lower rank than 50/70 using binomial confidence intervals.

Probably that heuristic is intended to be a Bayesian approach? Like, if you add just two successes and two failures to each scenario as a prior, that's enough to put the 50/70 option ahead.
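A minimal sketch of that pseudo-count idea, using the 7/9 and 50/70 figures from this thread (the +2/+2 prior is the suggestion above, not necessarily the blog post's own method):

```python
# Add two successes and two failures to each scenario before
# computing the rate, as suggested above.
trials = {"7/9": (7, 9), "50/70": (50, 70)}

for label, (successes, n) in trials.items():
    raw = successes / n
    smoothed = (successes + 2) / (n + 4)  # +2 successes, +2 failures
    print(f"{label}: raw={raw:.3f} smoothed={smoothed:.3f}")

# raw:      7/9 = 0.778 > 50/70 = 0.714
# smoothed: 9/13 = 0.692 < 52/74 = 0.703, so 50/70 moves ahead
```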

jldugger•5mo ago
And I guess, since they answer the questions at the bottom, their intent does seem to be the simplistic approach:

> The lower bound of which can be used to order the fractions, and so control the risk of over-estimation.

It's not clear to me from the question whether the cost of a mistake is in over-estimating the underlying effect or in misranking the effects, and that seems like it would drive your heuristic selection.

usgroup•5mo ago
From the question:

“However, it is very important that the uncertainty in the number of trials is taken into account because over-estimating a fraction is a costly mistake.”

Seems fairly clear to me that you’re supposed to use a lower bound estimate to take into account variance on the fraction due to the number of trials, in a way that bounds the chance of over-estimation.

Further, there is no need for a heuristic when there are several statistical models for this exact problem with clear properties. Some are given in the answer.

thekoma•5mo ago
Out of context, the expression "the uncertainty in the number of trials" would refer to missing knowledge in terms of how many trials actually ran.

In the context of the post this doesn't make sense, so the reader is left to hypothesize what the writer actually meant.

usgroup•5mo ago
I agree it could be clearer, but as a general rule, if you find an interpretation under which the question doesn’t make sense, try considering another interpretation.
taylorius•5mo ago
I think "uncertainty due to the number of trials" would be clearer.
tomsmeding•5mo ago
Uncertainty in the number of successes due to the number of trials would be even better.
jdhwosnhw•5mo ago
That would also be an incorrect phrasing. This entire thread is a good illustration of the difficulty of speaking precisely about probabilistic concepts.

(The number of successes has zero uncertainty. If you flip a coin 10 times and get 5 heads, there is no uncertainty on the number of heads. In general, for any statistical model the uncertainty is only with respect to an underlying model parameter - in this example, while your number of successes is perfectly known, it can be used to infer a probability of success, p, of 0.5, and there is uncertainty associated with that inferred probability.)
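A small illustration of this point, assuming a uniform Beta(1, 1) prior (one choice among several; the thread later discusses the Jeffreys prior):

```python
# The count (5 heads in 10 flips) is known exactly; the uncertainty
# attaches to the inferred probability of success p.
from scipy.stats import beta

heads, flips = 5, 10
posterior = beta(1 + heads, 1 + (flips - heads))  # Beta(6, 6)

print("posterior mean of p:", posterior.mean())  # 0.5
print("95% credible interval:", posterior.interval(0.95))
# roughly (0.23, 0.77): the data are certain, p is not
```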

jldugger•5mo ago
I guess I struggle to understand why under-estimation is worse than over-estimation, when the final result is a ranking. It seems like they're equally likely to produce an incorrect ranking!
kruffalon•5mo ago
I wrote the deleted comment you are replying to.

The essence of my comment was that this text/test is not for me (one person of the general public) but more like a few leetcode-style questions for statisticians.

Your attempt to explain what I didn't understand just proves my point, as I don't really understand what you are saying either.

And that's ok: this is just not for me! (And that's why I deleted my original comment)

dmurray•5mo ago
The problems are really underspecified for statisticians, too. Leetcode is normally very clear on the requirements.

> it is very important that the uncertainty in the number of trials is taken into account because over-estimating a fraction is a costly mistake.

This is not some precise jargon that is meaningless to the layman but completely clearly specified to a professional statistician. It's more like the specification written by your non-technical product manager for how some technical feature should work. A skilled data scientist will have the experience and the context to figure out what it's probably asking for, but he might write down a few more clarifying details before giving it to a junior on his team to implement.

If testing these kinds of guess-what-the-stakeholder-probably-means skills is the point of this test, it's quite good at it. But that's not what leetcode is for.

MontyCarloHall•5mo ago
Put more simply: suppose I have a coin that might be biased. I decide to assess this via repeatedly flipping it: if it’s biased, it will disproportionately land on heads or tails. If I flip 10 times and get 4 heads/6 tails, I don’t have the power to make a confident assessment of any bias. On the other hand, if I flip it 100 times and get 40 heads/60 tails, I am a bit more certain. At 1000 flips, with 400 heads/600 tails, I am extremely confident the coin is biased. Even though the fraction of heads is identically 40% across all three sets of flips, the underlying counts yield very different amounts of confidence on how close to 40% the coin’s bias is. The first question is a way of rigorously quantifying this confidence.

I don’t think this is “leetcode for statisticians.” This question (and the other two) are all examples of concrete, real-world problems that people across a variety of quantitative disciplines frequently encounter.

In fact, the first question is directly relevant to voting on this site. When sorting replies by fraction of upvotes, how should the forum software rank a new reply with 1 upvote/0 downvotes, versus an older reply with 4 upvotes/1 downvote? What about an older, more controversial reply with 20 upvotes/7 downvotes? 15 upvotes/2 downvotes?
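A sketch of both points, using the lower bound of the Wilson score interval as the ranking key (a common choice for exactly this problem; whether any forum actually uses it is not established in this thread):

```python
import math

def wilson_lower(successes, n, z=1.96):
    """Lower bound of the ~95% Wilson score interval for a proportion."""
    if n == 0:
        return 0.0
    p = successes / n
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - spread) / (1 + z * z / n)

# The coin example: identical 40% rates, shrinking uncertainty.
for heads, flips in [(4, 10), (40, 100), (400, 1000)]:
    print(f"{heads}/{flips}: lower bound {wilson_lower(heads, flips):.3f}")

# The voting example: (upvotes, total votes) per reply.
replies = {"1 up/0 down": (1, 1), "4 up/1 down": (4, 5),
           "20 up/7 down": (20, 27), "15 up/2 down": (15, 17)}
for name in sorted(replies, key=lambda r: wilson_lower(*replies[r]),
                   reverse=True):
    print(name, round(wilson_lower(*replies[name]), 3))
# ranks 15/2 first and 1/0 last, despite the latter's 100% raw rate
```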

jldugger•5mo ago
> This question (and the other two) are all examples of concrete, real-world problems that people across a variety of quantitative disciplines frequently encounter.

Indeed, I use this technique to sort search results in Splunk, as an extension of TF-IDF. Consider a scenario where us-east-2 is broken but us-east-1 is fine (clearly just a hypothetical!). Split the logs along that good/bad dimension, and then break down by some other pattern: log class, punct, etc. Usually I use a prior of 50:50 to help sort out the "happened once in bad cluster" events.

jldugger•5mo ago
No worries, I wasn't really trying to explain it anyway, as much as seeking confirmation from the rest of HN that this question is ill-specified. Judging from responses, yes, it is.

If "binomial distribution" and "confidence interval" are unfamiliar terms then you probably are not prepared to pass OP's "statistical reasoning test" regardless. I think most engineers wouldn't, and I only understood the intent of question 1 because my pandemic lockdown project was reading a stats textbook cover to cover.

yobbo•5mo ago
I think the formula should be (n+1)/(n+m+1) which should correspond to the mean of a binomial distribution with a uniform prior. So it's adding 1 to each count of observations.

This is probably the formula to memorise and check against.

kqr•5mo ago
You mean (n+1)/(n+m+2) and yes, this is Laplace's rule of succession. It won't give you a confidence interval, but it gives you a posterior point estimate.

If you want a rough 95 % confidence interval without complicated maths, the Agresti–Coull interval is useful. It's computed as if the distribution were normal, but pretending there were two more successes and two more failures than actually observed.
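A sketch of that interval, with the usual z ≈ 1.96 for 95 % (so z²/2 ≈ 2 extra successes and 2 extra failures), on the counts from this thread:

```python
import math

def agresti_coull(successes, n, z=1.96):
    """Approximate 95% Agresti-Coull interval for a proportion."""
    n_adj = n + z * z                        # roughly n + 4
    p_adj = (successes + z * z / 2) / n_adj  # roughly (x + 2) / (n + 4)
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half

print(agresti_coull(7, 9))    # wide interval: few trials
print(agresti_coull(50, 70))  # narrower: more trials
```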

yobbo•5mo ago
Yep, you're correct. It should be (n+1)/(n+m+2).

If you have access to a machine or lookup tables, you might as well plug in the values for the distribution Beta(1+n, 1+m), which should correspond to the posterior density.

(The formula above corresponds to the mean of this distribution, so it's probably right, but I haven't worked it through myself just now ...)

usgroup•5mo ago
See the Jeffreys posterior section here:

https://en.m.wikipedia.org/wiki/Binomial_proportion_confiden...

The blog post uses a non-informative Jeffreys prior.
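A sketch of that approach on the fractions discussed above, ordering by a lower posterior quantile (0.05 here, matching the 19:1 reading in the next comment; the quantile is a modelling choice, not fixed by the method):

```python
# Jeffreys posterior: Beta(successes + 0.5, failures + 0.5).
from scipy.stats import beta

def jeffreys_lower(successes, n, q=0.05):
    return beta(successes + 0.5, (n - successes) + 0.5).ppf(q)

for s, n in sorted([(7, 9), (50, 70)], key=lambda sn: jeffreys_lower(*sn)):
    print(f"{s}/{n}: 0.05 quantile {jeffreys_lower(s, n):.3f}")
# 7/9 gets the smaller lower bound despite its larger raw fraction
```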

kqr•5mo ago
The first question seems a little unfair because it does not say how much more expensive over-estimation is compared to under-estimation. It implicitly assumes 19:1, given that it's ordering by the 0.05 quantile of the posterior distribution, but that's information not contained in the question.
energy123•5mo ago
The general framework would be to sort by `U(dist(ratio))`. The choice of `U` (the utility function) is a separate question from the estimation of `dist(ratio)`.
usgroup•5mo ago
Bingo.
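Spelled out as code, with `dist(ratio)` taken to be a Jeffreys Beta posterior and two illustrative choices of `U` (both are assumptions for the sketch, not prescribed by the thread):

```python
from scipy.stats import beta

def dist_ratio(successes, n):
    return beta(successes + 0.5, (n - successes) + 0.5)

def U_mean(d):
    return d.mean()     # risk-neutral utility

def U_q05(d):
    return d.ppf(0.05)  # heavily penalises over-estimation

items = [(7, 9), (50, 70)]
for U, name in [(U_mean, "mean"), (U_q05, "0.05 quantile")]:
    print(name, sorted(items, key=lambda sn: U(dist_ratio(*sn))))
# the two utilities order these two items differently
```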
bjornsing•5mo ago
Is there a better principled approach to #1 than Monte Carlo sampling from beta distributions?
usgroup•5mo ago
The Jeffreys posterior in the answer is closed-form and Bayesian. The other answer is a profile likelihood.

Neither involves Monte Carlo sampling. Both are general and principled.

bjornsing•5mo ago
The answer looks a bit simplistic compared to the question (as I interpreted it at least). In order to estimate the risk of incorrect ordering you have to calculate P(p2 > p1) where p1 and p2 are drawn from different beta distributions. AFAIK there’s no closed form expression for that probability (so Monte Carlo is one possible approach).
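A minimal Monte Carlo sketch of that quantity, reusing the thread's 7/9 and 50/70 examples with Jeffreys posteriors (both assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

p1 = rng.beta(7 + 0.5, 2 + 0.5, N)    # posterior for 7 successes / 9 trials
p2 = rng.beta(50 + 0.5, 20 + 0.5, N)  # posterior for 50 successes / 70 trials

print("P(p2 > p1) ~", (p2 > p1).mean())
```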
usgroup•5mo ago
The question doesn't ask for that; it explicitly asks us to control for over-estimation of the fraction, although I rather like your interpretation as an extension.
bjornsing•5mo ago
Ok. I may have read a bit too much into this paragraph:

> Order the items above, smallest fraction first, whilst taking into account the uncertainty in the number of trials to bound the probability of over-estimating each fraction.

But why mention ordering if you’re not looking for statistical reasoning around the ordering in particular?

thorum•5mo ago
I don't know enough about statistics to answer these with math, but I've been on quite a few buses, and it's common at some stops for bus arrivals to cluster around specific times. If you always leave after the first bus you see, and most of your random observations fall before the first bus, won't you (almost) always miss the others?