Show HN: TheAuditor – I indexed my code into SQLite to stop AI hallucinations

https://github.com/TheAuditorTool/Auditor

7•ThailandJohn•1mo ago

Comments

ThailandJohn•1mo ago

Hi HN, OP here.

I’m a former Enterprise Systems Architect (Cisco/VMware) turned "vibe coder." I realized quickly that AI coding is dangerous because LLMs lack *context* and *verification*. They hallucinate because they are guessing at the file structure.

So, out of pure spite for flaky tools, I built *TheAuditor*.

*The Concept:* Instead of grepping files, I index the entire repo (Python, TS, Go, Rust, Terraform, CDK) into a local SQLite database (~180MB for a mid-sized repo). Because the code is in a DB, I can query the call graph with plain SQL.
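
To make the "call graph in SQL" idea concrete, here is a minimal sketch. The `calls(caller, callee)` table and the function names in it are hypothetical, made up for illustration, and are not TheAuditor's actual schema. It uses a recursive CTE to find every function that transitively calls a target:

```python
import sqlite3


def build_demo_db():
    """Create an in-memory DB with a hypothetical `calls` edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE calls (caller TEXT NOT NULL, callee TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [
            ("api.handler", "service.create_user"),
            ("service.create_user", "db.insert"),
            ("cli.main", "service.create_user"),
            ("service.report", "db.select"),
        ],
    )
    return conn


def transitive_callers(conn, target):
    """Return every function that directly or indirectly calls `target`."""
    rows = conn.execute(
        """
        WITH RECURSIVE up(name) AS (
            SELECT caller FROM calls WHERE callee = ?
            UNION  -- UNION (not UNION ALL) deduplicates, so cycles terminate
            SELECT c.caller FROM calls c JOIN up ON c.callee = up.name
        )
        SELECT name FROM up ORDER BY name
        """,
        (target,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(transitive_callers(build_demo_db(), "db.insert"))
```

A query like this is roughly what a "blast radius" lookup could reduce to once the edges are in a table.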

*The Tech (The "Hard" Part):* I needed a way to trace data flow through the infrastructure to prevent the AI from introducing vulnerabilities. I ended up building a *Hybrid Taint Engine* that extends the Oracle Labs (2021) IFDS research:

1. *Forward Flow:* Traces entry points to reachable sinks to prune the graph.
2. *Backward IFDS:* Runs a precise "Interprocedural Finite Distributive Subset" analysis on the pruned graph.
3. *The Handshake:* We only report vulnerabilities where both engines intersect.
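
The forward-prune, backward-confirm shape of those three steps can be sketched as below. This is a deliberately simplified illustration and not IFDS: both passes are plain graph reachability, and the stricter backward pass is approximated by refusing to walk through a set of sanitizer nodes. The graph, the node names, and the `sanitizers` parameter are all assumptions for the example.

```python
from collections import deque


def reachable(graph, starts, blocked=frozenset()):
    """Breadth-first reachability that never walks into `blocked` nodes."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Flip every edge so the graph can be walked from sinks to sources."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def confirmed_sinks(graph, sources, sinks, sanitizers):
    # Step 1: cheap forward pass (ignores sanitizers) prunes the graph.
    forward = reachable(graph, sources)
    pruned = {n: [d for d in ds if d in forward]
              for n, ds in graph.items() if n in forward}
    # Step 2: stricter backward pass on the pruned graph only.
    rev = reverse(pruned)
    confirmed = set()
    for sink in sinks:
        if sink not in forward:
            continue
        back = reachable(rev, [sink], blocked=sanitizers)
        # Step 3: report only when both passes agree on a source-to-sink path.
        if back & set(sources):
            confirmed.add(sink)
    return confirmed


if __name__ == "__main__":
    demo = {
        "request.args": ["build_query", "escape"],
        "build_query": ["db.execute"],
        "escape": ["render"],
        "config": ["log"],
    }
    print(confirmed_sinks(demo, ["request.args"],
                          ["db.execute", "render", "log"], {"escape"}))
```

In the demo, `log` is dropped by the forward pass, `render` is dropped by the backward pass because its only path runs through the sanitizer, and `db.execute` survives both.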

*The "Systems Architect" approach:* Coming from a background in critical infrastructure, I hate silent failures. I implemented a *5-Layer Fidelity System*. Every parser emits a cryptographic manifest. If the DB storage receipt doesn't match the manifest (transaction mismatch or data loss), the tool hard-crashes. I'd rather get a stack trace than a false negative.
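
A minimal sketch of the manifest-versus-receipt check, under stated assumptions: the manifest here is just a row count plus a SHA-256 digest over the sorted rows, and the `symbols` table, `FidelityError`, and function names are hypothetical. The real tool's manifest format and its five layers are not described in this post.

```python
import hashlib
import json
import sqlite3


class FidelityError(RuntimeError):
    """Raised when what was stored does not match what the parser emitted."""


def digest(rows):
    """Summarize rows as a count plus an order-independent SHA-256 digest."""
    h = hashlib.sha256()
    for row in sorted(rows):
        h.update(json.dumps(row).encode("utf-8"))
        h.update(b"\n")
    return {"count": len(rows), "sha256": h.hexdigest()}


def verify(manifest, receipt):
    """Fail loudly instead of continuing with partial data."""
    if manifest != receipt:
        raise FidelityError(f"manifest {manifest} != receipt {receipt}")


def store_and_verify(conn, symbols):
    """Insert parser output, then re-read it and compare against the manifest.

    Assumes the hypothetical `symbols` table is empty before the call.
    """
    manifest = digest(symbols)  # computed from parser output
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO symbols (name, line) VALUES (?, ?)", symbols)
    stored = conn.execute("SELECT name, line FROM symbols").fetchall()
    receipt = digest(stored)  # computed from what the DB actually holds
    verify(manifest, receipt)
    return receipt


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE symbols (name TEXT, line INTEGER)")
    print(store_and_verify(conn, [("create_user", 10), ("main", 42)]))
```

The design point is that the receipt is recomputed from the database rather than trusted from the write path, so a dropped or altered row changes the digest and triggers the crash.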

*Why I built it:* I use this as a "Flight Computer" for my AI agent.

* Before refactoring, it runs `aud impact` to calculate the blast radius.
* Before writing code, it runs `aud explain` to get a token-optimized context bundle of definitions.

This is v2 (a complete rewrite after 800 commits). I've learned a lot since my first attempt. The code is open source (AGPL).

Happy to answer questions about the SQLite schema or the IFDS implementation.
