I built CodeGraph CLI because I was tired of grep-ing through massive codebases trying to understand how things work.
It combines three things:
- tree-sitter (AST parsing, error-tolerant)
- SQLite (dependency graph: nodes + edges)
- LanceDB (vector embeddings, disk-based)
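To make the "nodes + edges" idea concrete, here is a minimal sketch of extracting call edges from source code and storing them in SQLite. It is not CodeGraph's actual code: Python's built-in `ast` module stands in for tree-sitter so the example runs without extra dependencies, and the table layout (`nodes`, `edges`) and the sample source are assumptions made for illustration.

```python
import ast
import sqlite3

# Hypothetical sample source, reusing the names from the authentication example.
SOURCE = '''
def validate_token(token):
    return TokenStore().lookup(token)

def login_handler(request):
    return validate_token(request)

class TokenStore:
    def lookup(self, token):
        return token == "secret"
'''

def build_graph(source: str) -> sqlite3.Connection:
    """Parse source and store top-level definitions and call edges in SQLite."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: one row per symbol, one row per relationship.
    conn.execute("CREATE TABLE nodes (name TEXT PRIMARY KEY, kind TEXT)")
    conn.execute("CREATE TABLE edges (src TEXT, dst TEXT, kind TEXT)")

    tree = ast.parse(source)
    definitions = [
        item for item in tree.body
        if isinstance(item, (ast.FunctionDef, ast.ClassDef))
    ]

    # Pass 1: every top-level function or class becomes a node.
    for item in definitions:
        kind = "class" if isinstance(item, ast.ClassDef) else "function"
        conn.execute("INSERT INTO nodes VALUES (?, ?)", (item.name, kind))
    known = {item.name for item in definitions}

    # Pass 2: a direct call to a known name becomes a 'calls' edge.
    # Method calls such as obj.lookup() are skipped to keep the sketch small.
    for item in definitions:
        for sub in ast.walk(item):
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                callee = sub.func.id
                if callee in known and callee != item.name:
                    conn.execute(
                        "INSERT INTO edges VALUES (?, ?, 'calls')",
                        (item.name, callee),
                    )
    conn.commit()
    return conn

conn = build_graph(SOURCE)
edges = sorted(conn.execute("SELECT src, dst FROM edges").fetchall())
print(edges)
```

A real indexer would also need to resolve imports, methods, and names across files, which this sketch deliberately leaves out.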
The key insight: pure vector search misses structural relationships. So I combined vector search with BFS graph traversal: first find semantically similar code, then expand from those hits to their dependencies and dependents.
Result: ask "how does authentication work?" and it finds validate_token(), its caller login_handler(), AND the dependency TokenStore — because it understands both meaning AND structure.
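The two-step retrieval described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the tool's implementation: the three-dimensional embeddings, the `render_chart` distractor, and the `hybrid_search` function with its `top_k` and `max_hops` parameters are all hypothetical, and a plain dict replaces LanceDB.

```python
import math
from collections import deque

# Hypothetical toy embeddings; a real system would use a trained model.
EMBEDDINGS = {
    "validate_token": [0.9, 0.1, 0.0],
    "login_handler": [0.3, 0.7, 0.0],
    "TokenStore": [0.2, 0.2, 0.6],
    "render_chart": [0.0, 0.1, 0.9],
}
# Directed edges (caller, callee), as they might come out of the dependency graph.
EDGES = [("login_handler", "validate_token"), ("validate_token", "TokenStore")]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hybrid_search(query_vec, top_k=1, max_hops=1):
    """Return {symbol: hop distance} for vector hits plus their graph neighbors."""
    # Step 1: vector search picks the semantically closest symbols as seeds.
    ranked = sorted(
        EMBEDDINGS,
        key=lambda name: cosine(query_vec, EMBEDDINGS[name]),
        reverse=True,
    )
    seeds = ranked[:top_k]

    # Step 2: BFS in both directions, so callers and callees are both reached.
    neighbors = {}
    for src, dst in EDGES:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    found = {name: 0 for name in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if found[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in found:
                found[nxt] = found[node] + 1
                queue.append(nxt)
    return found

# A query vector that happens to sit closest to validate_token.
result = hybrid_search([1.0, 0.0, 0.0])
print(result)
```

With these toy values, the vector step alone returns only `validate_token`; the graph step is what brings in `login_handler` and `TokenStore`, while the unrelated `render_chart` stays out.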
Other features:
- Impact analysis (multi-hop BFS: what breaks before you change it)
- Multi-agent system via CrewAI (4 specialized agents)
- Visual code explorer (browser-based)
- Auto-generate docs/READMEs
- 100% local-first (works with Ollama, zero data leaves machine)
- 6 LLM providers (Ollama, OpenAI, Anthropic, Groq, Gemini, OpenRouter)
- 5 embedding models (from zero-dependency hash to 1.5B code model)
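For the impact analysis item, a multi-hop BFS over reverse edges might look roughly like the sketch below. The `edges` table layout, the sample rows (including `api_router`, `render_chart`, and `format_axis`), and the `impact` function are assumptions for illustration, not the tool's real schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: src depends on (calls) dst.
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("login_handler", "validate_token"),
        ("validate_token", "TokenStore"),
        ("api_router", "login_handler"),
        ("render_chart", "format_axis"),  # unrelated code, should never show up
    ],
)

def impact(conn, changed, max_hops=3):
    """Return {dependent: hops} for everything that transitively uses `changed`."""
    hops = {changed: 0}
    frontier = [changed]
    for depth in range(1, max_hops + 1):
        next_frontier = []
        for node in frontier:
            # Walk edges backwards: who depends on this node?
            rows = conn.execute(
                "SELECT src FROM edges WHERE dst = ?", (node,)
            ).fetchall()
            for (dependent,) in rows:
                if dependent not in hops:
                    hops[dependent] = depth
                    next_frontier.append(dependent)
        if not next_frontier:
            break  # nothing new reached, stop early
        frontier = next_frontier
    del hops[changed]  # report only the affected dependents
    return hops

affected = impact(conn, "TokenStore")
print(affected)
```

The hop count doubles as a rough risk signal: direct dependents sit at hop 1, and lowering `max_hops` narrows the report to the closest callers.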
Quick start:

    pip install codegraph-cli
    cg config setup
    cg project index ./your-project
    cg chat start
MIT licensed. Python 3.9+.
Happy to answer questions about the graph-augmented RAG architecture or any technical decisions.