
How large should your sample size be?

https://vickiboykis.com/2015/08/04/how-large-should-your-sample-size-be/
38•sebg•4d ago

Comments

whatever1•1d ago
The statisticians will tell you that it was not big enough, hence their wrong projection was not their fault (again).
potamic•1d ago
How does this relate to things like nation-wide elections? For a population of 100 million, a 99% confidence level at a 0.1 interval needs a sample size of only a million. Does this mean it's not really important that everyone votes? Even an abysmal voter turnout is ultimately representative of the entire population?
YZF•1d ago
If it's a random sample then sure.
redtaperat•1d ago
For 100 million people and a 99% confidence level you only need 3000-4000 samples, depending on your margin of error.
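The standard formula behind estimates like this can be sketched in a few lines of Python. Note that the population size never appears; the 99% z-score and the 2-point margin of error below are illustrative choices, not numbers from the thread:

```python
import math

def sample_size(z: float, margin_of_error: float, p: float = 0.5) -> int:
    """Sample size for estimating a proportion via the normal approximation.

    Uses n = z^2 * p * (1 - p) / e^2. Taking p = 0.5 maximizes the
    variance p * (1 - p), so the result is a conservative worst case.
    """
    return math.ceil(z**2 * p * (1 - p) / margin_of_error**2)

# 99% confidence (z ≈ 2.576), margin of error ±2 percentage points:
print(sample_size(2.576, 0.02))  # → 4148
```

With a margin of error between 2 and 2.5 points this lands in the 3000-4000 range quoted above, for any large population.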
01HNNWZ0MV43FF•1d ago
At that point just elect them! https://en.wikipedia.org/wiki/Sortition
MrJohz•1d ago
A good sample needs to be random and representative of the population as a whole, otherwise you introduce sampling bias. Imagine trying to do a survey of what people's favourite fast food restaurants are, but doing it inside a McDonald's — it doesn't matter how large your sample is, it's going to be heavily biased. This is why survey companies spend a lot of effort trying to find random, representative samples of the population, and often weighting their samples so that they match the target population even more.

If we treat elections like a survey, then they have a massive inherent bias to the sampling method: the people who will get "surveyed" are the ones who are engaged enough to get registered, and then willing to go to a physical polling station and vote. This will naturally bias towards certain types of people.

In practice, we don't treat elections like a survey. If we did, we'd spend a lot of time afterwards weighting the results to figure out what the entire country really thought. But that has its own flaws, and ultimately voting is a civic exercise. You can do it, you can avoid it: that choice is yours, and ultimately part of your vote. In a way, you could argue that the sample size for an election is 100% of the population, where "for whatever reason, I didn't cast a vote" is a valid box to check on this survey.

That said, the whole "samples can be biased" thing is very much relevant for elections because many political groups have an incentive to add additional bias to the samples. That could be as simple as organising pick-ups to allow their voters to get to the polls, or teaching people how to register to vote if they're eligible, but it could also involve making it significantly harder or slower for certain groups (or certain regions) to register or vote.

potamic•1d ago
A 100% sample is unattainable, not just practically but fundamentally. Even if you made voting mandatory and ensured collection of every single vote, there would always be people who fudge their vote because they are not interested in the process. I argue that any election is only representative of the people engaged with the process, and that fundamentally cannot change. Within that subset, you shouldn't need 100% sampling for high confidence.

I agree that random distribution is key to this, but I don't see how that would change between messaging that everyone must vote and saying to vote only if you're interested.

MrJohz•1d ago
I mean that an election is (theoretically) a 100% sample, because every eligible voter has the ability to interact with the voting process at the level that they choose. So the decision of some people to invalidate their vote, or to vote tactically, or not to vote at all, or whatever else: that's part of the act of taking part in an election. In that sense, you can't not take part in an election if you're eligible to vote.

This is important, because normally, once you take a sample, you need to analyse that sample to ensure that it is representative, and potentially weight different responses if you want to make it more representative. For example, if you got a sample that was 75% women, you might weight the male responses more strongly to match the roughly 50/50 split between men and women in the general population. But in an election, we don't do this, because the assumption is that if you spoil your ballot or don't take part, that is part of your choice as a citizen.
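The post-stratification weighting described above can be sketched concretely. The 75%/25% sample split and the 50/50 target are the hypothetical numbers from the comment; the code is an illustration, not a claim about how any pollster implements it:

```python
# A sample that came back 75% women / 25% men, reweighted so that
# each group's effective share matches a 50/50 target population.
sample = {"women": 750, "men": 250}
target_share = {"women": 0.5, "men": 0.5}
total = sum(sample.values())

# weight = (share the group should have) / (share it actually has)
weights = {g: target_share[g] / (sample[g] / total) for g in sample}
print(weights)  # women weighted down (≈ 0.67), men weighted up (2.0)
```

Each man's response then counts for 2.0 in the weighted totals and each woman's for about 0.67, so the weighted sample behaves like a 50/50 one.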

But I think we're saying the same sort of thing, but in different ways: you can either see "the sample of an election is every citizen, regardless of whether they voted" or "the population of an election is everyone who voted", and in either case the sample is the same as the population, and we can therefore assume that it is representative of the population.

vouaobrasil•1d ago
Beyond simple statistics and random sampling, everyone voting is important: suppose only 1000 people were required to determine the result beyond reasonable uncertainty. Those 1000 would then hold too much power, could be easily bribed, and the result could be affected far more easily than if everyone voted.

Of course, psychologically, everyone needs to vote to have a say. But beyond even that psychological thing, everyone voting is really a security measure against tampering.

potamic•1d ago
This is a problem even with a large turnout, because swing voters are a thing and are generally the target of manipulation. You may only need to target a thousand key swing voters to get your nose ahead of the other contestants. In fact, I would argue that making bribery easy would level the playing field between contestants; otherwise the one with the deepest pockets tends to have the advantage.
HPsquared•1d ago
Democratic systems already have the problem of not accounting for how strongly each voter feels about something. Is it really fair for 51 people weakly in favour of something to overrule 49 who are very strongly against and consider the issue extremely important? That's surely a net negative decision.

Forcing people to vote who aren't interested only makes this effect even worse.

vouaobrasil•17h ago
I never said anything about forcing. Only giving people the chance to vote. As for your comments about democracy, well, I don't think democracy really works on a large scale. It's just probably the best system we have at the moment.
jyounker•1d ago
You nailed it vouaobrasil. The mark of an anti-democratic politician is that they seek to reduce the electorate's size.
rudebwai•1d ago
There is one key element missing from this explanation: the statistical model that you would like to use when you’re estimating. Without this you cannot do sample size estimation.
firesteelrain•1d ago
We know how to calculate the necessary sample size because the Central Limit Theorem tells us that the sampling distribution of the mean (or proportion) becomes approximately normal as the sample size grows. This justifies using Z-scores from the normal distribution to derive confidence intervals - and from there, solve for the sample size needed to achieve a desired margin of error and confidence level.

The article used a Python library without really explaining the reason or science behind the result. Knowing this can help when you read an article or watch a news report quoting "a study of 300 people…" — well, why 300 people? You can reasonably assume that the researchers used the CLT.
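The CLT claim above is easy to check empirically: the sample proportion's spread across repeated samples should match the standard error sqrt(p(1-p)/n). A minimal simulation, with p = 0.3, n = 300, and the seed chosen arbitrarily:

```python
import random
import statistics

random.seed(0)
p, n, trials = 0.3, 300, 2000

# Draw many samples of size n and record each sample proportion.
props = [sum(random.random() < p for _ in range(n)) / n for _ in range(trials)]

# CLT: the sample proportion is approximately normal with
# standard error sqrt(p * (1 - p) / n).
theoretical_se = (p * (1 - p) / n) ** 0.5
empirical_se = statistics.stdev(props)

print(round(theoretical_se, 4))  # → 0.0265
print(round(empirical_se, 4))    # close to the theoretical value
```

The empirical spread lands within a fraction of a percentage point of the theoretical standard error, which is exactly what lets you invert the formula and solve for n.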

Ntrails•1d ago
The real-world constraints of time and money are non-trivially involved in sample size decisions. CLT may be invoked, but I do not give the benefit of the doubt to studies being announced at this point.
firesteelrain•1d ago
You are right that time, cost, and feasibility often drive sample size decisions more than statistical ideals. My point was just that when researchers cite a specific number (like 300), there’s often a statistical basis tied to confidence levels and margin of error. Skepticism is healthy since not all studies follow best practices
roschdal•1d ago
42
lordnacho•1d ago
Why does the population matter?

Say you are an alien, and you want to know roughly the male-to-female ratio of people. Let's say the true ratio is 50%.

Wouldn't this be done by an unbiased sample that's quite small, regardless of whether there's 100M or 8B people on the planet?

HPsquared•1d ago
How do you know the sample is unbiased?
coderatlarge•1d ago
you could for example be sampling in a country or geography that favors male over female offspring for cultural and social reasons. then you have to refine your research question to further clarify what you are really trying to estimate.
lordnacho•21h ago
My point was that assuming you could solve that bias issue, why would it matter how big the population was?
Etheryte•1d ago
The whole point of sample sizing is to try and make the sample unbiased. If you're an alien, you don't know anything about your sample. As an example, say the alien wants to know the ratio of female to male lions. They sample one hunting pack and conclude that all lions are female.
CorrectHorseBat•1d ago
No, that's a whole different problem which bigger samples don't solve (unless you're getting close to sampling the whole population)
CorrectHorseBat•1d ago
If you go to the extreme: say you sample 100M people out of 100M, and 100M out of 8B. The first number is exact and the second has a very small error. So the error is a function of both sample size and population size.

In non-extreme cases, when your population is much bigger than the sample size, you're correct that it doesn't really make any difference.
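This extreme case is what the finite population correction (FPC) captures: the required sample shrinks once the sample is a non-trivial fraction of the population. A sketch, reusing the usual proportion formula with illustrative 99%/±2pt inputs:

```python
import math

def base_n(z: float, e: float, p: float = 0.5) -> float:
    # Required sample size for a proportion, infinite-population case.
    return z**2 * p * (1 - p) / e**2

def fpc(n0: float, population: int) -> int:
    # Finite population correction: n = n0 / (1 + (n0 - 1) / N).
    return math.ceil(n0 / (1 + (n0 - 1) / population))

n0 = base_n(2.576, 0.02)       # ≈ 4147 in the infinite-population limit
print(fpc(n0, 100_000_000))    # → 4148 (population size barely matters)
print(fpc(n0, 10_000))         # → 2932 (noticeably smaller)
```

For 100M people the correction is negligible, which is why population size drops out of the usual back-of-the-envelope numbers; for small populations it bites hard.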

derbOac•1d ago
I wondered about that too.

When I was an undergrad first learning statistics I asked my stats instructor (a grad student) about this issue and they responded with something like "the population size doesn't matter because for the assumptions of the test to be met... such and such..." I kind of accepted that answer — we were talking about asymptotic inferences — but it never seemed quite right to me.

The example I gave was actually motivated in part by a sort of real-world problem I was dealing with: let's say you only want to make inferences about a population of 20 individuals. Certainly if you have a sample of 19 out of those 20, your confidence about the population will be much stronger than with the same sample size from a population of 100 million.

One thing he did say which is probably right, is that that 1/20 you didn't sample might throw things off, so it's more influential in a sense than a single member of a population of 100 million.

At the time I hadn't learned about exact and Jaynesian-permutation statistics, but that's probably the right way to think about finite populations. That is, something like "what are all the outcomes you could observe, and what proportion of those does my observed result represent?"
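The 20-person example above can be made concrete with exactly that enumeration, via a tiny hypergeometric computation (the K = 12 "yes" count is an assumed number for illustration):

```python
from math import comb

# Exact, finite-population view: population N = 20, of whom K = 12
# are "yes". We sample n = 19 and ask which counts k we could observe.
N, K, n = 20, 12, 19

probs = {
    k: comb(K, k) * comb(N - K, n - k) / comb(N, n)
    for k in (K - 1, K)  # the left-out person is either a "yes" or a "no"
}
print(probs)  # → {11: 0.6, 12: 0.4}
```

Sampling 19 of 20 pins K down to within one either way, which is the sense in which the single unsampled member carries far more weight than one member of a population of 100 million.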

It's just that usually our population is so large that the exact test approach becomes infeasible to deal with without approximations, and you end up with the typical classical asymptotic statistics.

It's all maybe a moot point, but it's always a good idea to think about the population you're trying to make inferences about. I think that probably includes the population size, and I think population size sometimes matters more than you might initially expect.

As for your last question, obtaining an unbiased sample is kind of harder as the number of attributes you're being unbiased with regard to increases. It's a permutation problem again, probably implicit usually with regard to sampling representativeness.