I was building a proxy to strip PII from LLM API calls and realized
that zero-width Unicode characters break basically every PII filter out
there. If you stick a zero-width space (U+200B) inside a name, so "Tom"
becomes "T" + U+200B + "om", Presidio's NER model doesn't see it as a name
anymore. The same trick gets SSNs and phone numbers past regex matching.
So I built a normalization layer that strips those characters before
running detection.
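
To make the failure mode concrete, here is a minimal sketch of that kind of normalization step in Python. It is illustrative only, not the actual Veil code: the names `normalize` and `SSN_RE` are made up for this example, and it assumes that dropping every character in Unicode category Cf (format characters, which includes U+200B, U+200C, U+200D, U+2060 and U+FEFF) is acceptable for the text being scanned.

```python
import re
import unicodedata

# Hypothetical SSN pattern for the demo; a real filter would use more rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def normalize(text: str) -> str:
    """Drop invisible format characters (Unicode category Cf).

    Assumption: removing all Cf characters is fine for detection purposes.
    This also removes joiners that are legitimate in emoji sequences and in
    some scripts, so run detection on the normalized copy rather than
    overwriting the user's original text blindly.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    raw = "My SSN is 123-\u200b45-6789"
    print(SSN_RE.search(raw))             # no match: the hidden character splits the digits
    print(SSN_RE.search(normalize(raw)))  # matches once the hidden character is gone
```

The same idea applies to an NER-based detector: feed it the normalized string, so a name split by an invisible character is whole again before the model sees it.
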

The proxy itself is pretty simple. You point your OpenAI base URL at Veil
and it redacts PII before the request leaves, then puts the real values
back in the response. It works with streaming too, which was the hard
part honestly.
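
A rough sketch of why streaming is the awkward part, assuming the redaction swaps each value for a placeholder token. Everything here is hypothetical and for illustration (the `<PII_n>` token format, `redact`, `restore`, `restore_stream`); it is not the actual Veil implementation. The catch is that a placeholder can arrive split across two chunks, so the tail of each chunk has to be held back until it is clear whether it is the start of a token.

```python
import re
from typing import Dict, Iterable, Iterator

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # stand-in for a real detector
# Matches any proper prefix of a placeholder such as "<PII_12>" (no closing ">").
PARTIAL = re.compile(r"<(P(I(I(_\d*)?)?)?)?")


def redact(text: str, mapping: Dict[str, str]) -> str:
    """Replace each detected value with a placeholder and remember the original."""
    def repl(match: re.Match) -> str:
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token
    return SSN_RE.sub(repl, text)


def restore(text: str, mapping: Dict[str, str]) -> str:
    """Put the original values back in place of complete placeholders."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


def restore_stream(chunks: Iterable[str], mapping: Dict[str, str]) -> Iterator[str]:
    """Restore values in a chunked response, buffering incomplete placeholders."""
    buf = ""
    for chunk in chunks:
        buf = restore(buf + chunk, mapping)
        cut = buf.rfind("<")
        if cut != -1 and PARTIAL.fullmatch(buf[cut:]):
            # The tail might be the start of a placeholder: emit the rest, keep the tail.
            out, buf = buf[:cut], buf[cut:]
        else:
            out, buf = buf, ""
        if out:
            yield out
    if buf:
        yield buf  # stream ended mid-token; pass the leftover through unchanged


if __name__ == "__main__":
    mapping: Dict[str, str] = {}
    print(redact("My SSN is 123-45-6789.", mapping))
    print("".join(restore_stream(["Got it: <PI", "I_0", "> noted"], mapping)))
```

Holding back only the suspicious tail keeps latency low: ordinary text is forwarded as soon as it arrives, and a chunk is delayed only when it ends in something that could still turn into a placeholder.
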

It's at https://veil-api.com, and the free tier is 100 requests/month.