We've been bitten by this a few times — real email addresses ending up in CSV test fixtures that got committed to repos. Curious how data engineering teams actually handle this in practice.
Do you gate on it in CI? Manual review? Just trust the process?
We built a small local CLI scanner for this — deterministic pattern matching, no network calls, exits non-zero on HIGH risk findings so you can block PRs. Happy to share if useful but mostly curious what others are doing.
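For anyone wondering what that kind of check looks like, here's a rough Python sketch of the idea. It is not the scanner itself, just an illustration: the function names (`scan_lines`, `exit_code`, `main`), the regex, and the rule "anything outside the RFC 2606 reserved domains counts as HIGH" are all assumptions made for the example.

```python
import re
import sys
from pathlib import Path

# Deliberately simple email pattern (illustrative, not RFC 5322 complete).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")

# Domains and TLDs reserved for documentation/testing (RFC 2606).
SAFE_DOMAINS = ("example.com", "example.org", "example.net")
SAFE_TLDS = {"test", "example", "invalid", "localhost"}


def risk_for(domain):
    """Assumed policy: reserved domains are LOW, everything else is HIGH."""
    domain = domain.lower()
    if domain.rsplit(".", 1)[-1] in SAFE_TLDS:
        return "LOW"
    if any(domain == d or domain.endswith("." + d) for d in SAFE_DOMAINS):
        return "LOW"
    return "HIGH"


def scan_lines(lines):
    """Return (line_number, email, risk) for every email-like match."""
    findings = []
    for line_no, line in enumerate(lines, start=1):
        for match in EMAIL_RE.finditer(line):
            findings.append((line_no, match.group(0), risk_for(match.group(1))))
    return findings


def exit_code(findings):
    """Non-zero if any finding is HIGH, so CI can block the PR."""
    return 1 if any(risk == "HIGH" for _, _, risk in findings) else 0


def main(paths):
    """Scan each file locally (no network calls) and report findings."""
    all_findings = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        for line_no, email, risk in scan_lines(text.splitlines()):
            print(f"{path}:{line_no}: {risk} {email}")
            all_findings.append((line_no, email, risk))
    return exit_code(all_findings)


# Only scan when file paths are passed on the command line.
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

Saved under a hypothetical name like `scan_fixtures.py`, you could run it as `python scan_fixtures.py tests/fixtures/*.csv` in a pre-commit hook or CI step. The non-zero exit status is what fails the job. Addresses on reserved domains such as `alice@example.com` pass, which gives people a safe way to write fixtures.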
dk970•1h ago