Show HN: I built a zero-log PII redaction API – no AI, just regex and checksums

https://pii-firewall-edge-web.vercel.app

1•Raviteja_•1mo ago

Hi HN,

I built PII Firewall because I got tired of watching "privacy" APIs secretly pipe user data to cloud AI models. If you're using GPT/Claude to redact PII, you're literally giving the AI your PII.

What makes this different:

- Zero AI – deterministic regex + 30 checksum validators (Luhn, Verhoeff, Mod 11/97)
- Zero storage – processes on Cloudflare edge, no logs, no persistence
- 152 PII types – SSN, Aadhaar, 50+ country IDs, 20 API key formats, crypto wallets
- Two modes: `/fast` (2-5ms) for structured PII, `/deep` (5-15ms) adds names/addresses via 2000+ name gazetteer

The technical approach:

Instead of ML inference, I use combined V8-optimized regex with heuristic pre-scanning. Clean text (90% of requests) skips pattern matching entirely. For IDs that require it, I implemented full checksum validation:
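To make the pre-scan idea concrete, here is a minimal sketch of how a cheap single pass can gate the regex stage. It is an illustration only, not the service's code: the trigger set (`@`, `-`, digits), the `hasTriggerChars` and `redact` names, and the email pattern are all assumptions.

```javascript
// Hypothetical pre-scan: one O(n) pass looking for characters that any
// structured-PII pattern would need ('@', '-', or an ASCII digit).
function hasTriggerChars(text) {
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i);
    if (c === 64 /* '@' */ || c === 45 /* '-' */ || (c >= 48 && c <= 57) /* '0'-'9' */) {
      return true;
    }
  }
  return false;
}

// Illustrative email pattern, not the one the API actually uses.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function redact(text) {
  // Clean text returns immediately and never reaches the regex engine.
  if (!hasTriggerChars(text)) return text;
  return text.replace(EMAIL, "[EMAIL]");
}
```

Text with no trigger characters is returned unchanged after the scan; only text that could plausibly contain structured PII pays for pattern matching.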

- Credit cards: Luhn
- Indian Aadhaar: Verhoeff
- Chinese ID: ISO 7064 Mod 11
- Brazilian CPF/CNPJ: Dual Mod 11
- IBAN: Mod 97
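As an example of what checksum validation adds on top of a pattern match, here is a sketch of the standard Luhn check for card numbers. The function name `luhnValid` and the accepted length range (12 to 19 digits) are assumptions for illustration; the service's own validator may differ.

```javascript
// Luhn check: walking from the rightmost digit, double every second digit,
// subtract 9 from any result above 9, and require the total to be a multiple of 10.
function luhnValid(input) {
  const digits = input.replace(/[\s-]/g, ""); // tolerate spaces and hyphens
  if (!/^\d{12,19}$/.test(digits)) return false; // assumed length range

  let sum = 0;
  let doubleIt = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}
```

A candidate that matches the 16-digit pattern but fails this check would be left alone, which is how a checksum cuts false positives on arbitrary digit runs.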

Runs on Cloudflare Workers (pure JS, no WASM), so no cold starts.

Why I'm sharing:

Enterprise PII solutions cost $50K+/year. I wanted to make this accessible to indie devs, startups, and anyone building AI features who doesn't want to become a data liability. The $5/mo tier covers most use cases.

Would love feedback on the detection coverage or edge cases I might be missing.

Comments

Raviteja_•1mo ago

Quick technical notes for HN:

Why no AI?

The irony of sending PII to an AI model to detect PII is lost on most "privacy" APIs. This is pure algorithmic detection – the same approach your credit card company uses to validate card numbers.

What's validated (not just pattern-matched):
- Credit cards → Luhn checksum
- Aadhaar → Verhoeff (catches all single-digit errors and adjacent transpositions)
- IBAN → Mod 97 (same as banks use)
- Singapore NRIC → Mod 11 with offset
- Brazilian CPF → Dual Mod 11
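For the IBAN case, a sketch of the standard Mod 97 check looks roughly like this. The name `ibanValid` is hypothetical, and the sketch only checks the generic IBAN shape and checksum, not per-country lengths.

```javascript
// IBAN Mod 97: move the first four characters to the end, map letters to
// numbers (A=10 ... Z=35), and require the resulting integer mod 97 to equal 1.
function ibanValid(input) {
  const iban = input.replace(/\s+/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;

  const rearranged = iban.slice(4) + iban.slice(0, 4);
  let remainder = 0;
  for (const ch of rearranged) {
    const code = ch.charCodeAt(0);
    // Digits map to themselves; letters map to 10..35.
    const value = code >= 65 ? code - 55 : code - 48;
    // A letter contributes two decimal digits, so shift by 100 instead of 10.
    // Reducing mod 97 at each step avoids needing big integers.
    remainder = (remainder * (value > 9 ? 100 : 10) + value) % 97;
  }
  return remainder === 1;
}
```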

Latency breakdown:
- Heuristic scan: O(n) single pass for trigger characters (@, -, digits)
- Pattern matching: Only runs if triggers found
- Validation: Only on pattern matches
- Total: 2-5ms for `/fast`, 5-15ms for `/deep`

False positive mitigation:
- "Order ID: 123-45-6789" won't trigger SSN (negative context)
- Timestamps won't match phone patterns (separator requirements)
- Random 16-digit numbers won't trigger credit card (Luhn must pass)
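One way the negative-context rule could work is to inspect the text just before each SSN-shaped match. This is a sketch under assumptions: the cue words, the 24-character look-back window, and the `redactSsn` name are invented for illustration and are not taken from the service.

```javascript
// SSN-shaped candidates: 3-2-4 digits separated by hyphens.
const SSN = /\b\d{3}-\d{2}-\d{4}\b/g;

// Hypothetical negative-context cues: a label such as "Order ID:" directly
// before the number suggests it is not an SSN.
const NEGATIVE_CONTEXT =
  /\b(?:order|invoice|ticket|tracking)\s*(?:id|no|number|#)?\s*[:#]?\s*$/i;

function redactSsn(text) {
  return text.replace(SSN, (match, offset) => {
    // Look at a short window of text immediately preceding the candidate.
    const before = text.slice(Math.max(0, offset - 24), offset);
    return NEGATIVE_CONTEXT.test(before) ? match : "[SSN]";
  });
}
```

With these cues, `redactSsn("Order ID: 123-45-6789")` leaves the text untouched, while `redactSsn("SSN: 123-45-6789")` returns `"SSN: [SSN]"`.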

max_aucube•1mo ago

The project is great, honestly. But when I put a space in the email by mistake, it wasn't censored.

Raviteja_•1mo ago

Great catch! Emails with spaces around @ (like "test @ example.com") slip through. This is a classic obfuscation bypass.

The current pattern intentionally matches only RFC 5321 compliant emails (no spaces). Adding support for spaced variants is a trade-off: we would catch more bypass attempts but also get more false positives on text like "send @ 5pm". I'll add this to the roadmap. Appreciate the feedback! This is exactly the kind of edge case I need to hear about to make the API better.
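To illustrate the trade-off, here is a sketch comparing a strict pattern with a looser one that tolerates whitespace around `@`. Both regexes are assumptions written for this example, not the patterns the API uses.

```javascript
// Strict: no whitespace allowed anywhere in the address.
const STRICT_EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/;

// Loose (hypothetical): allows whitespace around '@' but still requires a
// dotted domain ending in an alphabetic TLD, which rules out "send @ 5pm".
const SPACED_EMAIL =
  /[A-Za-z0-9._%+-]+\s*@\s*[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}/;

const samples = ["test@example.com", "test @ example.com", "send @ 5pm"];
for (const s of samples) {
  console.log(JSON.stringify(s), STRICT_EMAIL.test(s), SPACED_EMAIL.test(s));
}
```

Requiring a dotted domain removes the "send @ 5pm" false positive, but the loose pattern can still misfire on text with a missing space after a period, such as "meet @ 5pm.Thanks", so it widens coverage at some cost in precision.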

comfytummyedgy•1mo ago

We integrated AI into our product recently and are looking for ways to protect our users' data. Definitely going to check it out and try it in our workflow.
