I'm one of the creators of HoundDog.ai (https://github.com/hounddogai/hounddog). We currently handle privacy scanning for Replit's 45M+ creators.
We built HoundDog because privacy compliance is usually a choice between manual spreadsheets and reactive runtime scanning. While runtime tools are useful for monitoring, they only catch leaks after the code is live and the data has already moved. They can also miss code paths that aren't actively triggered in production.
HoundDog traces sensitive data in code during development and helps catch risky flows (e.g., PII leaking into logs or unapproved third-party SDKs) before the code is shipped.
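For example, here's the kind of flow it flags (a made-up Python snippet; the field and function names are illustrative):

    import logging

    logger = logging.getLogger(__name__)

    def register(user):
        email = user["email"]  # sensitive data element (PII)
        logger.info(f"registered user {email}")  # PII flows into a log sink -> flagged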
The core scanner is a standalone Rust binary. It doesn't use LLMs, so it's local, deterministic, cheap, and fast. It can scan 1M+ lines of code in seconds on a standard laptop, and it supports 80+ sensitive data types (PII, PHI, CHD) and hundreds of data sinks (logs, SDKs, APIs, ORMs, etc.) out of the box.
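Trying it on a repo is a single command (see the README for install steps and the exact option list):

    hounddog scan /path/to/your/repo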
We use AI internally to expand and scale our rules, identifying new data sources and sinks, but the execution is pure static analysis.
The scanner is free to use (no signups), so please try it out and send us feedback.
A bit more on how it works under the hood: when the scanner finds a match, it traces that data through the codebase across different paths and transformations, including reassignments, helper functions, and nested calls. It then identifies where the data ultimately ends up, such as third-party SDKs (e.g. Stripe, Datadog, OpenAI), exposures in API protocols like REST, GraphQL, or gRPC, and functions that write to logs or local storage. Here's a list of all supported data sinks: https://github.com/hounddogai/hounddog/blob/main/data-sinks....
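As a toy example (names are made up; the Stripe call is standard Stripe Python SDK usage), the scanner follows the email from the source match, through the helper's return value and the reassignment, to the third-party sink:

    import stripe

    def build_contact(user):
        contact = user["email"]  # source: match on a sensitive field
        return contact           # the value flows out through the return

    def sync_customer(user):
        addr = build_contact(user)          # reassignment across a helper call
        stripe.Customer.create(email=addr)  # sink: third-party SDK -> flagged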
Most privacy frameworks, including GDPR and US privacy laws, require these flows to be documented, so we use your source code as the source of truth to keep privacy notices accurate and aligned with what the software is actually doing. I'll be around to answer any questions!