US bans differential privacy in Census data

https://desfontain.es/blog/banning-noise.html

94•nl•1h ago

Comments

whatever1•18m ago

We can make them more accurate by leveraging ICE going door to door.

Pragmata•17m ago

Frankly, I see no reason to keep this data private. They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/ etc...

Fundamentally this is public data. If it's too dangerous to make public, it's too dangerous to collect, and people should be aware of exactly what it is.

There are very few things that the state has data on that should not be made public. Census data is simply not one of those things.

Publishing should be the default for any data, and keeping it unpublished should require substantially good reasons that impact the country as a whole. Frankly, if it isn't detailed national defence plans, I struggle to see any data that should not be public.

simonw•12m ago

How hard have you thought about this?

The biggest challenge with running a census is getting people to trust you enough to answer your questions.

A lot of census questions are sensitive. The ACS covers topics like citizenship status, disabilities, income, SNAP assistance, languages spoken at home.

If you want accurate information about the people who live in your country you need the census process to feel as safe for people to respond to as possible.

Are you saying the census shouldn't collect any data that people wouldn't be comfortable publishing? Because that's a recipe for a census that is far less useful for helping the country make useful decisions.

mobeets•8m ago

Thank you for writing a much more thoughtful reply to this comment than I was drafting

jonhohle•6m ago

This seems like an issue created by Congress. The Constitution only requires a headcount by state. Maybe they should use another mechanism to collect demographic data. Since the concern is not about representation, but allocation, tax returns seem like an obvious alternative, and they are already private and collected at a much more granular level.

abletonlive•4m ago

The census is also used for congressional apportionment and allocating federal funds. People that do not have citizenship status should not be represented.

halJordan•12m ago

That's a good default position, and I think should be our starting point.

But the devil is in the details. If we don't want advertisers constructing semi-complete profiles from simple web interactions then why would we publish 330 million census questionnaires for their use?

UqWBcuFx6NV4r•10m ago

Don’t quit your day job. One guess as to what gender, sexual orientation, and skin colour you have.

toast0•9m ago

> They should simply publish a full dataset of the census, with no such data coarsening/differential privacy/ etc...

They do. After a substantial delay. Pretty handy for genealogical research, while protecting privacy for the living.

CAP_NET_ADMIN•5m ago

1. People give the information to the government under the expectation that this data is to be kept private or used in such a way that individual targeting is made impossible. Break that expectation and people will lie or won't give you this data.

2. Without noise injection it's rather simple to do statistical attacks to reverse engineer individual entities.

3. This data is and has already been used in the past to undermine democratic systems by targeting and disenfranchising minorities, as well as gerrymandering the US to hell.

4. "Too dangerous to make public, too dangerous to collect" - this is a false dichotomy. To govern effectively you need sensitive data, but it should be collected and used in a way that's safe for the individuals.

5. Macro level aggregates don't need individual exposure, that's why noise, anonymization and statistical functions are fine.
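
To make point 2 concrete, here is a minimal sketch of a differencing attack and of how added noise blunts it. Everything in it is an assumption for illustration: the toy block, the income values, the function names, the epsilon of 0.5, and the income cap of 100,000 used as the sensitivity. It is not how the Census Bureau's system actually works, just the textbook Laplace mechanism applied to a sum.

```python
import random

# Hypothetical toy block: (resident, income). All values are made up.
BLOCK = [("a", 41_000), ("b", 58_000), ("c", 37_500), ("d", 92_000)]


def exact_total(rows):
    """Publish the exact income total for a set of residents."""
    return sum(income for _, income in rows)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_total(rows, epsilon, sensitivity, rng):
    """Laplace mechanism: add noise with scale = sensitivity / epsilon.

    Assumes every income is clamped to [0, sensitivity], so one person
    can change the total by at most `sensitivity`.
    """
    return exact_total(rows) + laplace_noise(sensitivity / epsilon, rng)


def differencing_attack(total_fn, rows, target):
    """Estimate one resident's income from two aggregate queries."""
    everyone = total_fn(rows)
    everyone_but_target = total_fn([r for r in rows if r[0] != target])
    return everyone - everyone_but_target


rng = random.Random(0)
leaked = differencing_attack(exact_total, BLOCK, "d")
blurred = differencing_attack(
    lambda rows: noisy_total(rows, epsilon=0.5, sensitivity=100_000, rng=rng),
    BLOCK,
    "d",
)
print(leaked)   # the target's exact income, recovered from two totals
print(blurred)  # the same estimate, now dominated by random noise
```

With exact totals, two innocuous-looking aggregates are enough to recover one person's answer. With noise on each release, the same subtraction gives a value that is mostly noise, and the size of that noise is the explicit accuracy/privacy trade-off the article talks about.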

abletonlive•14m ago

There will be a bunch of people who start off with the premise that this data should be private and build their arguments on that premise.

So I'll just go ahead and ask: what are the good reasons why this data should be private?

My guess is that most of you think we should be counting illegals because they should have representation. And I reject that.

simonw•11m ago

How about we should be "counting illegals" so that we know how many of them there are?

(Do you reject that? As someone who uses the phrase "counting illegals" I imagine you would be interested in knowing what that number is.)

abletonlive•9m ago

First off, the census is used for congressional apportionment and for allocating federal funds.

So unless you're willing to also say that counted illegals cannot be used for either of those, then you're just being obtuse.

But if we can agree that they cannot be used for that then sure, let's identify and count them. If we can't identify (make non-private) and count them then why should we trust that those counts are accurate?

Cyph0n•6m ago

It’s because people are significantly more likely to lie or omit some facts if you don’t guarantee their privacy, which means your census data ends up being worth less than a pile of shit.

The alternative is to water down the census questions, which also leads you down the same path (i.e. manure as data).

delichon•14m ago

The dueling political demands of accuracy and privacy are simply incompatible at some level. After reading this, maybe Hanlon's Razor isn't the right standard. Besides malice and stupidity, there is impossibility. Some problems just aren't solvable under certain constraints. I don't envy the statisticians tasked with finding a politically palatable solution to a math problem.

ghaff•5m ago

There's a ton of information in the US that is accessible to various degrees--especially through the deep web, much less background investigations. Unless you're a wealthy person who can set up various levels of trusts you can't really hide them.

You can of course disagree about what should actually be part of a transparent public record. (Though I suspect a lot of people post-date what was generally available in a "phone book.")

xenophonf•11m ago

This is a gift to reactionary gerrymandering and voting restriction efforts, along with things like yesterday's FBI raid of an Ohio voting rights organization.

https://www.statenews.org/government-politics/2026-06-12/ohi...

Representative Joyce Beatty is from Ohio and was instrumental in stopping Trump from illegally renaming the Kennedy Center.

https://www.theatlantic.com/culture/2026/06/kennedy-center-b...

tbrownaw•9m ago

> Differential privacy makes this trade-off explicit, and thus impossible to ignore. Maybe banning it is a way of pretending that the problem doesn't exist, in the hope that it will go away?

Or it's saying that one of these conflicting goals is more valuable than the other, and so shouldn't be sacrificed for it.

asolove•9m ago

The replies here arguing we should publish it all are wild in the worst kind of first-order thinking way.

It’s a census: it just asks questions.

If you start publishing and weaponizing the data against people with various attributes, they’ll just lie or not answer. And then you are left with worse than nothing: bad data people try to act on.

jmole•8m ago

Ban it from the dataset, add it to the analysis. You can choose your own flavor of noise.

I don't know what the political undertones are here, but at some level you need to have actual ground truth, including "this person/household declined".

Publishing raw data though? That seems like shooting yourself in the foot from a national security perspective, not to mention all the other reasons not to do it.
