A name fragment that's harmless in record #1 becomes identifying when it co-occurs with a location in record #47 and a timestamp in record #203. Static masking can't see that.
This project instead treats de-identification as a stateful control problem. The system maintains a per-subject exposure graph across time and modalities, computes rolling re-identification risk, and escalates masking strength dynamically, only when cumulative exposure justifies it.
The core idea: privacy protection as a feedback loop, not a preprocessing step.
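To make the feedback-loop idea concrete, here is a minimal sketch of cumulative exposure driving masking strength. All names (`ExposureState`, the risk formula, the thresholds) are illustrative assumptions, not the repo's actual API; the real system's risk model is richer than this toy saturating count.

```python
# Hypothetical sketch: per-subject exposure state drives masking strength.
# Names and thresholds are illustrative, not the project's real API.
from dataclasses import dataclass, field


@dataclass
class ExposureState:
    # Cumulative counts of quasi-identifier categories seen for one subject.
    counts: dict = field(default_factory=dict)

    def observe(self, identifier_kinds):
        for kind in identifier_kinds:
            self.counts[kind] = self.counts.get(kind, 0) + 1

    def risk(self):
        # Toy rolling risk: more distinct co-occurring identifier
        # categories -> higher linkage risk, saturating at 1.0.
        return min(1.0, len(self.counts) / 4)


def masking_level(state, low=0.3, high=0.7):
    # Escalate masking only when cumulative exposure justifies it.
    r = state.risk()
    if r < low:
        return "none"
    if r < high:
        return "partial"
    return "full"


state = ExposureState()
state.observe({"name"})                   # record #1: harmless alone
state.observe({"location", "timestamp"})  # later records add linkage surface
print(masking_level(state))               # escalates as exposure accumulates
```

The point of the loop: a name fragment alone stays below the escalation threshold, but once a location and a timestamp co-occur for the same subject, the same masker tightens for that subject only.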
A few things I found interesting building this:
- Cross-modal linkage (text + ASR + image proxy + waveform headers) creates non-obvious re-ID surfaces
- Pseudonym versioning on risk escalation lets you contain linkage continuity without global reprocessing
- The privacy–utility tradeoff is actually controllable if you model exposure state explicitly
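Pseudonym versioning can be sketched in a few lines. This is an assumed mechanism, not the repo's implementation: pseudonyms are deterministic within a version (so records still join locally), and bumping the version on risk escalation cuts linkage to past records without touching them.

```python
# Illustrative sketch of pseudonym versioning (assumed mechanics, not the
# project's actual implementation).
import hashlib


class PseudonymStore:
    def __init__(self):
        self.version = {}  # subject_id -> current pseudonym version

    def pseudonym(self, subject_id):
        v = self.version.get(subject_id, 0)
        # Deterministic within a version; unlinkable across versions.
        digest = hashlib.sha256(f"{subject_id}:{v}".encode()).hexdigest()
        return f"subj-{digest[:8]}"

    def escalate(self, subject_id):
        # Called when rolling risk crosses a threshold: bump the version.
        # Old records keep their old pseudonym; no global reprocessing.
        self.version[subject_id] = self.version.get(subject_id, 0) + 1


store = PseudonymStore()
before = store.pseudonym("patient-7")
store.escalate("patient-7")
after = store.pseudonym("patient-7")
assert before != after  # linkage continuity cut at the escalation point
```

The design choice this illustrates: escalation is an O(1) state update per subject, so containment does not require rewriting the already-released stream.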
All experiments run on synthetic streaming data (no real PHI). Reproducible from source. Colab demo included.
Repo: https://github.com/azithteja91/phi-exposure-guard
Happy to discuss the architecture, the RL policy design, or the tradeoffs vs. existing de-ID approaches.