Why I built this: I needed a way to verify AI-generated code was production-safe. Existing tools either required cloud uploads (privacy concern) or produced output too large for AI context windows. TheAuditor solves both problems - it runs completely offline and chunks findings into 65KB segments that fit in Claude/GPT-4 context limits.
What I discovered: Testing on real projects, TheAuditor consistently finds 50-200+ vulnerabilities in AI-generated code. The patterns are remarkably consistent:
- SQL queries using f-strings instead of parameterization
- Hardcoded secrets (JWT_SECRET = "secret" appears in nearly every project)
- Missing authentication on critical endpoints
- Rate limiting using in-memory storage that resets on restart
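To make the first pattern concrete, here's a minimal illustration (the table, values, and attack string are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

user_id = "1 OR 1=1"  # attacker-controlled value

# The pattern that shows up constantly: user input interpolated straight into SQL.
print(cur.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchall())    # both rows leak

# The parameterized version the audit expects instead: the driver treats the
# input as a literal value, not as SQL.
print(cur.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()) # []
```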
Technical approach: TheAuditor runs 14 analysis phases in parallel, including taint analysis (tracking data from user input to dangerous sinks), pattern matching against 100+ security rules, and orchestrating industry tools (ESLint, Ruff, MyPy, Bandit). Everything outputs to structured JSON optimized for LLM consumption.
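To give a flavor of what one pattern rule looks like, here's a heavily simplified sketch (not the actual implementation) of a single check: flag an f-string passed straight to an execute()-style call, and emit a finding record like the ones in the JSON reports.

```python
import ast

SQL_SINKS = {"execute", "executemany"}  # illustrative sink names

def find_fstring_sql(source: str, filename: str = "<input>") -> list[dict]:
    """Flag f-strings passed directly to a SQL execute-style call."""
    findings = []
    for node in ast.walk(ast.parse(source, filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in SQL_SINKS
            and node.args
            and isinstance(node.args[0], ast.JoinedStr)  # f-string literal
        ):
            findings.append({
                "rule": "sql-fstring",
                "file": filename,
                "line": node.lineno,
                "severity": "critical",
            })
    return findings

print(find_fstring_sql('cur.execute(f"SELECT * FROM t WHERE id = {uid}")'))
```

The real tool does far more than this (full taint tracking from sources to sinks, 100+ rules, cross-tool correlation), but this is the general shape of one rule and one finding record.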
Interesting obstacle: When scanning files with vulnerabilities, antivirus software often quarantines our reports because they contain "malicious" SQL injection patterns - even though we're just documenting them. Had to implement pattern defanging to reduce false positives.
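For anyone curious, defanging is essentially just breaking up the substrings scanners match on while keeping the report readable. A rough sketch of the idea (not the exact substitutions the tool uses):

```python
# Break up the substrings AV engines match on, without losing readability.
DEFANG_MAP = {
    "UNION SELECT": "UNION[_]SELECT",
    "OR 1=1": "OR[_]1=1",
    "<script>": "<scr[.]ipt>",
}

def defang(snippet: str) -> str:
    for needle, safe in DEFANG_MAP.items():
        snippet = snippet.replace(needle, safe)
    return snippet

print(defang("payload: ' UNION SELECT password FROM users --"))
```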
Current usage: Run aud full in any Python/JS/TS project. It generates a complete security audit in .pf/readthis/. The AI can then read these reports and fix its own vulnerabilities. I've seen projects go from 185 critical issues to zero in 3-4 iterations.
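If you want to script the read-back yourself instead of pointing the assistant at the folder, the reports are just files in .pf/readthis/, each sized to fit a context window. A minimal sketch (the file naming beyond the directory is an assumption here, and the "pass to assistant" step is whatever client you use):

```python
from pathlib import Path

# Where `aud full` writes its chunked reports.
REPORT_DIR = Path(".pf/readthis")

def load_report_chunks() -> list[str]:
    """Read each report chunk; each is sized to fit an LLM context window."""
    return [
        path.read_text(encoding="utf-8", errors="replace")
        for path in sorted(REPORT_DIR.glob("*"))
        if path.is_file()
    ]

for i, chunk in enumerate(load_report_chunks(), start=1):
    print(f"--- chunk {i}: {len(chunk):,} characters ---")
    # pass `chunk` to whichever AI assistant is doing the fixing
```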
The tool is particularly useful if you're using AI assistants for production code but worry about security. It provides the "ground truth" that AI needs to self-correct.
Would appreciate feedback on:
- Additional vulnerability patterns common in AI-generated code
- Better ways to handle the antivirus false-positive issue
- Integration ideas for different AI coding workflows
Thanks for taking a look! /TheAuditorTool
quibono•2h ago
That's a strange ask in the Python ecosystem - what's the reason for this?
Also, what's the benefit of ESLint/Ruff/MyPy being utilised by this audit tool? I'm not sure I understand the benefit of having an LLM in between you and Ruff, for example.
ffsm8•2h ago
It's breathtaking how much of an enabler it already is, but curating a good dependency tree and staying within the scope of the outlined work are not things LLMs are currently good at.
TheAuditorTool•2h ago
The ESLint/Ruff/MyPy integration isn't about putting an LLM between you and linters. It's about aggregation and correlation. Example:
- Ruff says "unused import"
- MyPy says "type mismatch"
- TheAuditor correlates: "You removed the import but forgot to update 3 type hints that depended on it"
The LLM reads the aggregated report to understand the full picture across all tools, not just individual warnings.
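A toy illustration of what "correlation" means here, with made-up finding records (the real report schema differs):

```python
from collections import defaultdict

# Made-up finding records, roughly the shape individual tools report things in.
findings = [
    {"tool": "ruff", "file": "api/users.py", "symbol": "UserSchema",
     "msg": "F401 'UserSchema' imported but unused"},
    {"tool": "mypy", "file": "api/users.py", "symbol": "UserSchema",
     "msg": "Incompatible type: expected 'UserSchema', got 'dict'"},
]

# Correlation step: group findings that touch the same file and symbol, so the
# report tells one story ("this import and the type hints that used it are out
# of sync") instead of two unrelated warnings from two different tools.
grouped = defaultdict(list)
for f in findings:
    grouped[(f["file"], f["symbol"])].append(f)

for (path, symbol), items in grouped.items():
    if len({f["tool"] for f in items}) > 1:
        print(f"{path} :: {symbol} flagged by multiple tools:")
        for f in items:
            print(f"  [{f['tool']}] {f['msg']}")
```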
@ffsm8: You're absolutely right - I can't code and the dependency tree is probably a mess! That's exactly WHY I built this. When you're using AI to write code and can't verify if it's correct, you need something that reports the ground truth.
The irony isn't lost on me: I used Claude to build a tool that audits code written by Claude. It's enablement all the way down! But that's also the proof it works - if someone who can't code can use AI + TheAuditor to build TheAuditor itself, the development loop is validated.
The architectural decisions might be weird, but they're born from necessity, not incompetence. Happy to explain any specific weirdness!