Benchmark: 7 tasks on FastAPI (the OSS repo, ~800 Python files), 3 runs/task/arm, 42 total runs, Claude Sonnet 4.6, both arms in --strict-mcp-config isolation.

Without graph: ~23 tool calls, ~40K input tokens, 504 output tokens, $0.78/task.
With graph: ~2.3 tool calls, ~8K input tokens, 189 output tokens, $0.33/task.

The 58% cost reduction and 22% speed improvement were expected. The 63% output token reduction was not. When Claude gets 40K tokens of context (most of it irrelevant), it generates a lot of "let me look at this file... I can see that..." narration while it orients itself. When it gets 8K tokens of pre-filtered, graph-ranked context, it skips straight to the answer. The exploration filler disappears.

This seems like a general property of these models: noisy input → verbose output, focused input → focused output. I'd be curious if others have observed this in different contexts.
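For anyone checking the headline percentages against the per-task figures above, the arithmetic is just relative reduction (the 22% speed number comes from wall-clock timings not listed here):

```python
# Per-task averages from the benchmark above (without graph vs. with graph).
def reduction(before, after):
    """Percentage reduction from `before` to `after`, one decimal place."""
    return round(100 * (1 - after / before), 1)

print(reduction(0.78, 0.33))      # cost:          57.7  -> ~58%
print(reduction(504, 189))        # output tokens: 62.5  -> ~63%
print(reduction(40_000, 8_000))   # input tokens:  80.0
```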
The approach: tree-sitter AST parsing → dependency graph in SQLite → single MCP tool (run_pipeline) that takes a task description, walks the graph, and returns ranked context. Full source for high-centrality pivot nodes, compact skeletons for supporting code. Savings varied by task type — code-understanding tasks saved the most (-64%), bug fixes the least (-30%). Makes sense: the more exploration a task normally requires, the more waste there is to cut.
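A minimal sketch of the storage-and-ranking idea, not the actual implementation: a toy dependency graph in SQLite, with degree centrality standing in for whatever scoring the real tool uses. All table, column, and module names here are invented for illustration.

```python
import sqlite3

# Toy dependency graph in SQLite. In the real pipeline the nodes would come
# from tree-sitter AST parsing; here they are hardcoded placeholders.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE nodes (id TEXT PRIMARY KEY, source TEXT, skeleton TEXT);
    CREATE TABLE edges (src TEXT, dst TEXT);  -- src depends on dst
""")
db.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    ("app.main",    "def main(): <full body>",  "def main(): ..."),
    ("app.routing", "def route(): <full body>", "def route(): ..."),
    ("app.models",  "class User: <full body>",  "class User: ..."),
    ("app.utils",   "def slug(): <full body>",  "def slug(): ..."),
])
db.executemany("INSERT INTO edges VALUES (?, ?)", [
    ("app.main", "app.routing"),
    ("app.main", "app.models"),
    ("app.routing", "app.models"),
    ("app.utils", "app.models"),
])

def ranked_context(db, top_n=1):
    """Rank nodes by degree centrality; return full source for the top
    `top_n` pivot nodes and compact skeletons for everything else."""
    rows = db.execute("""
        SELECT n.id, n.source, n.skeleton,
               (SELECT COUNT(*) FROM edges WHERE src = n.id OR dst = n.id)
                   AS degree
        FROM nodes n ORDER BY degree DESC, n.id
    """).fetchall()
    return [(nid, src if i < top_n else skel)
            for i, (nid, src, skel, _) in enumerate(rows)]

# app.models has the most edges, so it is the pivot and gets full source;
# the rest come back as skeletons.
for node_id, text in ranked_context(db):
    print(node_id, "->", text)
```

In the real tool the ranking would presumably also weight the task description, but the shape of the output is the same: a small, ordered context bundle instead of raw file dumps.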
Code: the graph resolution is handwritten Rust. The MCP transport, SQLite schema, and benchmark harness were built with Claude Code (which felt appropriate). The benchmark analysis scripts were 100% Claude-written.
Free tier at https://vexp.dev — 2K nodes, 1 repo, no time limit. Runs locally (tree-sitter + SQLite, no cloud).