I built PII Firewall because I got tired of watching "privacy" APIs secretly pipe user data to cloud AI models. If you're using GPT/Claude to redact PII, you're literally giving the AI your PII.
What makes this different:
- Zero AI – deterministic regex + 30 checksum validators (Luhn, Verhoeff, Mod 11/97)
- Zero storage – processes on Cloudflare edge, no logs, no persistence
- 152 PII types – SSN, Aadhaar, 50+ country IDs, 20 API key formats, crypto wallets
- Two modes: `/fast` (2-5ms) for structured PII, `/deep` (5-15ms) adds names/addresses via 2000+ name gazetteer
The technical approach:
Instead of ML inference, I use combined, V8-optimized regexes with a heuristic pre-scan. Clean text (about 90% of requests) skips pattern matching entirely. For IDs that carry a check digit, I implemented full checksum validation:
- Credit cards: Luhn
- Indian Aadhaar: Verhoeff
- Chinese ID: ISO 7064 MOD 11-2
- Brazilian CPF/CNPJ: Dual Mod 11
- IBAN: Mod 97
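
To make the checksum idea concrete, here is a minimal sketch of the Luhn check in Python (the service itself is JavaScript; `luhn_valid` is an illustrative name, not the project's actual code). It assumes separators are limited to spaces and hyphens.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    # Assumption: only spaces and hyphens are accepted as separators.
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk right to left. The rightmost digit is the check digit;
    # every second digit to its left is doubled.
    for position, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # well-known test card number
print(luhn_valid("4111 1111 1111 1112"))  # last digit altered
```

A random 16-digit string passes this check only about one time in ten, which is why gating on the checksum removes most accidental matches.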
Runs on Cloudflare Workers (pure JS, no WASM), so no cold starts.
Why I'm sharing:
Enterprise PII solutions cost $50K+/year. I wanted to make this accessible to indie devs, startups, and anyone building AI features who doesn't want to become a data liability. The $5/mo tier covers most use cases.
Would love feedback on the detection coverage or edge cases I might be missing.
Raviteja_•2h ago
Why no AI?
The irony of sending PII to an AI model to detect PII is lost on most "privacy" APIs. This is pure algorithmic detection – the same approach your credit card company uses to validate card numbers.
What's validated (not just pattern-matched):
- Credit cards → Luhn checksum
- Aadhaar → Verhoeff (catches all single-digit errors and all adjacent transpositions)
- IBAN → Mod 97 (same as banks use)
- Singapore NRIC → Mod 11 with offset
- Brazilian CPF → Dual Mod 11
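
As an illustration of what "validated" means here, this is a hedged Python sketch of the standard IBAN Mod 97 check (`iban_valid` is a hypothetical name; the real service is JavaScript). It deliberately skips the per-country length and format tables that a production validator would also need.

```python
def iban_valid(iban: str) -> bool:
    """Return True if the IBAN passes the Mod 97 check (remainder must be 1)."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False

    # Step 1: move the country code and check digits (first 4 chars) to the end.
    rearranged = s[4:] + s[:4]

    # Step 2: replace letters with numbers, A=10 ... Z=35.
    # int(ch, 36) gives exactly this mapping for 0-9 and A-Z.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)

    # Step 3: the resulting integer must leave remainder 1 when divided by 97.
    return int(numeric) % 97 == 1


print(iban_valid("GB82 WEST 1234 5698 7654 32"))  # commonly cited example IBAN
print(iban_valid("GB83 WEST 1234 5698 7654 32"))  # check digits altered
```

Python's arbitrary-precision integers keep this short; in JavaScript the same check would need `BigInt` or a chunked running remainder.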
Latency breakdown:
- Heuristic scan: O(n) single pass for trigger characters (@, -, digits)
- Pattern matching: Only runs if triggers found
- Validation: Only on pattern matches
- Total: 2-5ms for /fast, 5-15ms for /deep
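
The three stages above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the two patterns, the `PATTERNS` table, and the `scan` function are made up, and the real service is JavaScript with far more patterns.

```python
import re

# Stage 1 triggers, taken from the list above: '@', '-', and digits.
TRIGGERS = frozenset("@-0123456789")


def valid_ipv4(candidate: str) -> bool:
    """Validator: every octet must be in the range 0-255."""
    return all(int(part) <= 255 for part in candidate.split("."))


# Hypothetical pattern table: (label, compiled regex, optional validator).
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), None),
    ("IPV4", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), valid_ipv4),
]


def scan(text: str) -> list[tuple[str, str]]:
    # Stage 1: one O(n) pass; clean text returns before any regex runs.
    if not any(ch in TRIGGERS for ch in text):
        return []

    findings = []
    for label, pattern, validator in PATTERNS:
        # Stage 2: pattern matching, reached only when a trigger was seen.
        for match in pattern.finditer(text):
            # Stage 3: validation runs only on pattern matches.
            if validator is None or validator(match.group()):
                findings.append((label, match.group()))
    return findings


print(scan("hello world, nothing to see"))
print(scan("mail bob@example.com from 10.0.0.1"))
```

The early return is what lets clean text skip the regex stage entirely; the validator column keeps the expensive checks off the hot path.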
False positive mitigation:
- "Order ID: 123-45-6789" won't trigger SSN (negative context)
- Timestamps won't match phone patterns (separator requirements)
- Random 16-digit numbers won't trigger credit card (Luhn must pass)
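
A negative-context check like the first bullet might look something like the following Python sketch. The keyword list, the 24-character look-behind window, and the name `find_ssns` are all assumptions for illustration, not the service's actual rules.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical negative context: labels that suggest the number is not an SSN.
# Anchored with $ so the label must sit directly before the match.
NEGATIVE_CONTEXT = re.compile(
    r"\b(?:order|invoice|tracking|ticket|ref(?:erence)?)"
    r"\s*(?:id|no|number|#)?\s*[:#]?\s*$",
    re.IGNORECASE,
)

WINDOW = 24  # assumed number of characters inspected before each match


def find_ssns(text: str) -> list[str]:
    hits = []
    for match in SSN_RE.finditer(text):
        before = text[max(0, match.start() - WINDOW):match.start()]
        if NEGATIVE_CONTEXT.search(before):
            continue  # labelled as an order/invoice/etc. number, so skip it
        hits.append(match.group())
    return hits


print(find_ssns("Order ID: 123-45-6789"))
print(find_ssns("SSN: 123-45-6789"))
```

Anchoring the label to the end of the window means an unrelated "order" earlier in the sentence does not suppress a real match.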