Show HN: Epstein's emails reconstructed in a message-style UI (OCR and LLMs)

https://github.com/Toon-nooT/epsteins-phone-reconstructed

46•toon-noot•1mo ago

This project reconstructs the Epstein email records from the recent U.S. House Oversight Committee releases using only public-domain documents (23,124 image files + 2,800 OCR text files).

Most email pages contain only one real message, buried under layers of repeated headers/footers. I wanted to rebuild the conversations without all the surrounding noise.

I used an OCR + vision-LLM pipeline to extract individual messages from the email screenshots, normalize senders/recipients, rebuild timestamps, detect duplicates, and map threads. The output is a structured SQLite database that runs client-side via SQL.js (WebAssembly).
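
For readers curious what the normalize and dedupe steps could look like, here is a minimal sketch. It is not the repository's actual code: the function names, the `messages` table layout, and the choice to hash normalized sender, timestamp, and body are all assumptions made for illustration. It also keeps a `source_image` column so each stored message can point back to a source page.

```python
import hashlib
import re
import sqlite3


def normalize_sender(raw: str) -> str:
    """Collapse whitespace and lowercase so OCR variants of a name compare equal."""
    return re.sub(r"\s+", " ", raw).strip().lower()


def message_key(sender: str, sent_at: str, body: str) -> str:
    """Hash the normalized fields; the same message seen on two pages gets one key."""
    clean_body = re.sub(r"\s+", " ", body).strip()
    canonical = "\x1f".join([normalize_sender(sender), sent_at, clean_body])
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_db(records, path=":memory:"):
    """Insert extracted records into SQLite, skipping duplicates by key."""
    conn = sqlite3.connect(path)
    # Hypothetical schema, not the project's real one.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               key TEXT PRIMARY KEY,
               sender TEXT NOT NULL,
               sent_at TEXT NOT NULL,
               body TEXT NOT NULL,
               source_image TEXT NOT NULL
           )"""
    )
    for r in records:
        # INSERT OR IGNORE keeps the first occurrence and drops later duplicates.
        conn.execute(
            "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?, ?)",
            (
                message_key(r["sender"], r["sent_at"], r["body"]),
                normalize_sender(r["sender"]),
                r["sent_at"],
                r["body"],
                r["source_image"],
            ),
        )
    conn.commit()
    return conn
```

Exact hashing only catches duplicates that match after whitespace and case normalization. OCR noise inside the body would need fuzzy matching, which this sketch does not attempt.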

The repository includes the full extraction pipeline, data cleaning scripts, schema, limitations, and implementation notes. The interface is a lightweight PWA that displays the reconstructed messages in a phone-style UI, with links back to every original source image for verification.

Live demo: https://epsteinsphone.org

All source data is from the official public releases; no leaks or private material.

Happy to answer questions about the pipeline, LLM extraction, threading logic, or the PWA implementation.

Comments

pfd1986•1mo ago

The convo with Noam Chomsky is interesting. The Deepak Chopra one talking about Trump being 'loco' is quite funny.

Neat data visualization solution!

toon-noot•1mo ago

Thanks!

dizhn•1mo ago

Android/Firefox. Nothing's happening when I tap the icons on the demo site.

toon-noot•1mo ago

Thanks for the feedback. I'll try to reproduce. I spent more time on the data pipeline than on testing the UI across platforms...

marstall•1mo ago

Brilliant. I feel bad asking for something more, but an inline annotation of who these people are would take it over the top.

palmotea•1mo ago

One nit: the message view seems to auto-hyphenate long words on line-breaks to pack in more text, but one of the things that's struck me about Epstein is how utterly incompetent he was with punctuation. Those correctly-inserted hyphens distract from that impression.

pea•1mo ago

This is really cool, I enjoyed going through them in this form. Thanks

lights0123•1mo ago
