I kept needing to pass large codebases to LLMs for tasks like PRD generation. Every time I paste a whole codebase into Claude, I pay for tokens spent on files the model doesn't care about. But if I paste only the relevant files, the model loses context about how everything fits together. It's an annoying tradeoff.
llmdoc is a small CLI that adds a short LLM-generated summary for each file and regenerates it only when the file's hash changes.
llmdoc annotate # Adds a summary for each file (respects .gitignore, and can be configured to ignore more)
llmdoc dump # Generates a handy "at a glance" summary to give to an LLM for complete context of your codebase.
There's also llmdoc check for CI: it exits with status 1 if any annotation is stale or missing, and it needs no API key.
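For anyone curious how a hash-based staleness check can work without an API key, here is a minimal sketch. It is not llmdoc's actual code: the function names (content_hash, find_stale, check), the use of SHA-256, and the in-memory dictionaries are all assumptions made for illustration. The idea is only that comparing a stored hash against the current file contents needs no model call.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hex-encoded SHA-256 of the file body (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale(files: dict, recorded: dict) -> list:
    """Return paths whose annotation is missing or whose hash no longer matches.

    files:    path -> current file contents
    recorded: path -> hash stored when the summary was last generated
    """
    return sorted(
        path
        for path, body in files.items()
        if recorded.get(path) != content_hash(body)
    )


def check(files: dict, recorded: dict) -> int:
    """CI-style exit code: 0 if every annotation is fresh, 1 otherwise."""
    return 1 if find_stale(files, recorded) else 0


# Hypothetical two-file project where only app.py has been annotated.
files = {"app.py": "print('hi')\n", "util.py": "def f(): pass\n"}
recorded = {"app.py": content_hash(files["app.py"])}
```

With this data, find_stale reports util.py as missing, and editing app.py afterwards would make it stale as well, since its stored hash would no longer match.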
It supports Anthropic and OpenAI, works with 50+ languages, respects .gitignore, and has a --dry-run flag that estimates cost before touching anything.
A known issue is rate limiting by LLM providers. Because everything is keyed on file hashes, files that already have an up-to-date summary are skipped, so you can just rerun the command a few times until every file is covered.
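To show why rerunning is safe, here is a rough sketch of a resumable annotate loop. Again, this is an illustration and not llmdoc's implementation: RateLimited, annotate, the summarize callback, and the store layout are hypothetical names. Each finished file is recorded with its hash right away, so a run that stops on a rate limit loses nothing and the next run picks up where it left off.

```python
import hashlib


class RateLimited(Exception):
    """Hypothetical error raised when the provider rejects a request."""


def content_hash(text: str) -> str:
    """Hex-encoded SHA-256 of the file body (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def annotate(files: dict, store: dict, summarize) -> int:
    """Summarize files whose stored hash is missing or out of date.

    files:     path -> current file contents
    store:     path -> {"hash": ..., "summary": ...}, updated in place
    summarize: callable(path, body) -> summary text, may raise RateLimited
    Returns the number of files annotated during this run.
    """
    done = 0
    for path, body in sorted(files.items()):
        digest = content_hash(body)
        if store.get(path, {}).get("hash") == digest:
            continue  # summary is already current, no API call needed
        try:
            summary = summarize(path, body)
        except RateLimited:
            break  # stop here; everything stored so far is kept
        store[path] = {"hash": digest, "summary": summary}
        done += 1
    return done
```

A real tool would probably add backoff between requests; the sketch only covers the skip-if-unchanged part that makes repeated runs converge.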
Let me know what you think!