In high-conflict litigation, "black box" redactions are often a disaster waiting to happen. I realized that many people (and even law firms) use civilian-grade tools that leave "Ghost Layers"—original text layers or metadata underneath the digital ink.
I built the Shadow Report as a free forensic tool to prove this. You can upload a redacted page, and it scans for:
Ghost Text Layers: Checking if the searchable PDF layer still exists beneath the redaction blocks (using pypdf layer inspection).
Metadata Leaks: Extracting Author/Producer info that reveals who actually drafted the document.
Image Fingerprints: Scraping EXIF data that can geo-locate or time-stamp "anonymous" evidence.
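A minimal sketch of how the first two checks could look with pypdf. This is an illustration under assumptions, not the Shadow Report code itself. The names `PageFinding`, `find_text_layers`, `metadata_leaks`, and `scan_pdf` are hypothetical. It only reports pages that still carry extractable text; it does not prove the text sits under a redaction box, so each hit still has to be compared against the redacted regions.

```python
from dataclasses import dataclass


@dataclass
class PageFinding:
    page_number: int      # 1-based page index
    extracted_chars: int  # characters recoverable from the text layer
    sample: str           # leading characters, for the report


def find_text_layers(reader, sample_len: int = 80) -> list[PageFinding]:
    """Return one finding per page that still has extractable text.

    `reader` is expected to behave like pypdf.PdfReader: it exposes
    `.pages`, and each page exposes `.extract_text()`.
    """
    findings = []
    for number, page in enumerate(reader.pages, start=1):
        text = (page.extract_text() or "").strip()
        if text:
            findings.append(PageFinding(number, len(text), text[:sample_len]))
    return findings


def metadata_leaks(reader) -> dict[str, str]:
    """Collect document-info fields that can identify the drafter or tool."""
    meta = reader.metadata  # pypdf returns None when there is no info dict
    if meta is None:
        return {}
    fields = {
        "author": meta.author,
        "creator": meta.creator,
        "producer": meta.producer,
    }
    return {name: value for name, value in fields.items() if value}


def scan_pdf(path: str) -> dict:
    """Run both checks on a file; requires the third-party pypdf package."""
    from pypdf import PdfReader  # imported lazily so the helpers stay testable

    reader = PdfReader(path)
    return {
        "text_layers": find_text_layers(reader),
        "metadata": metadata_leaks(reader),
    }
```

If a page that is supposed to be fully blacked out shows up in `text_layers`, the redaction was most likely drawn on top of the text instead of removing it.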
Backstory: This is a component of a larger project called Exit Protocol. I started it after a friend was quoted $50k for a forensic accountant to trace "separate property" in a divorce. The math they use—the Lowest Intermediate Balance Rule (LIBR)—is deterministic, but accountants do it manually in Excel. I automated the LIBR math to handle 10k+ transactions via Celery/Postgres.
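For readers who have not met LIBR: the traceable amount in a commingled account can never exceed the lowest balance the account reached after the separate-property deposit, and later deposits of other money do not refill it. A minimal sketch of that rule, assuming a single account, transactions already in chronological order, and an opening balance that is not separate property. `Txn` and `trace_libr` are hypothetical names, and this is not the Exit Protocol implementation.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    amount: Decimal         # positive = deposit, negative = withdrawal
    separate: bool = False  # True if the deposit is claimed separate property


def trace_libr(opening_balance: Decimal, txns: list[Txn]) -> Decimal:
    """Return the separate-property amount still traceable after all txns."""
    balance = opening_balance
    traced = Decimal("0")
    for txn in txns:
        balance += txn.amount
        if txn.separate and txn.amount > 0:
            traced += txn.amount
        # The traced claim is capped by the running balance and never
        # recovers when later, non-separate deposits raise the balance.
        traced = max(Decimal("0"), min(traced, balance))
    return traced


example = [
    Txn(Decimal("10000"), separate=True),  # inheritance deposited
    Txn(Decimal("5000")),                  # paycheck
    Txn(Decimal("-12000")),                # balance drops to 3000
    Txn(Decimal("20000")),                 # later deposit does not restore the claim
]
print(trace_libr(Decimal("0"), example))  # 3000
```

Using `Decimal` instead of floats keeps the cents exact, which matters when the output is going in front of a court. Same-day transaction ordering changes the result, so a real tracer needs an explicit ordering rule.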
Stack:
Django 5.0 (Monolith) / Postgres
pypdf & Pillow for the forensic scanning
Celery for async processing of massive bank discoveries
Air-gapped "BYOK" model for law firms (Docker)
I'd love feedback on:
Are there other "Ghost Layer" detection methods I should implement (e.g., color-space delta analysis)?
For those in LawTech: How do you handle "PDFs from hell" (scanned, rotated, handwritten notes)? I'm currently using a custom OC-3 implementation.
Try the Redaction Check: https://exitprotocols.com/redaction-check/
Main Site: https://exitprotocols.com/
cd_mkdir•1h ago
I’m a dev who got frustrated seeing forensic accountants charge $500/hr to do this in spreadsheets. So I built Exit Protocol to automate the forensic tracing and "impeachment" of financial lies.