The token problem: agents read entire files linearly to build context. On a medium TypeScript project, a single query was consuming ~18k tokens — most of it irrelevant. vexp builds a dependency graph from the AST (who calls what, who imports what, what types flow where) and serves only the relevant subgraph as a token-budgeted capsule. ~2.4k tokens instead of ~18k, with better response quality because the context is precise.
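The capsule idea can be sketched as a budgeted graph walk: start from the symbols the query mentions, follow dependency edges breadth-first, and stop adding nodes once the token budget is spent. Everything below is illustrative (the `Node` shape, `capsule` function, and per-node token estimates are assumptions, not vexp's actual internals):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Hypothetical graph node: a symbol plus the estimated token cost of its source.
struct Node {
    name: &'static str,
    tokens: usize,
    deps: Vec<&'static str>, // outgoing edges: calls, imports, type uses
}

// Breadth-first walk from the query's seed symbols, collecting nodes
// in dependency order until the token budget is exhausted.
fn capsule(graph: &HashMap<&str, Node>, seeds: &[&str], budget: usize) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = seeds.iter().copied().collect();
    let mut out = Vec::new();
    let mut spent = 0;
    while let Some(name) = queue.pop_front() {
        if !seen.insert(name) {
            continue; // already visited
        }
        if let Some(node) = graph.get(name) {
            if spent + node.tokens > budget {
                continue; // this node doesn't fit the budget; skip it
            }
            spent += node.tokens;
            out.push(node.name.to_string());
            queue.extend(node.deps.iter().copied());
        }
    }
    out
}
```

The point of the walk order is that graph-near symbols land in the capsule first, so whatever the budget cuts off is the least relevant material.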
The memory problem: this is where it gets interesting. The obvious approach is giving agents a "save what you learned" tool. They won't use it. I tried every prompting trick. Agents optimize for task completion, not knowledge retention. The incentive structure is fundamentally wrong.
So vexp observes passively. It watches which symbols the agent explored, which files changed and how they changed structurally, and what patterns emerge across sessions, then builds memory without the agent lifting a finger. When code changes, linked memories auto-stale: the agent sees that previous context exists but the code has changed since, and should be re-evaluated. It also catches anti-patterns like dead-end exploration and file thrashing so the agent doesn't repeat mistakes.
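Auto-staling can be sketched as content fingerprinting: each memory stores a hash of the code span it was learned from, and re-indexing compares that hash to the current source. This is a minimal sketch under assumed names (`Memory`, `fingerprint`, `refresh` are hypothetical, not vexp's API):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical memory entry linked to a span of code.
struct Memory {
    note: String,
    code_hash: u64, // fingerprint of the linked source at write time
    stale: bool,
}

fn fingerprint(source: &str) -> u64 {
    let mut h = DefaultHasher::new();
    source.hash(&mut h);
    h.finish()
}

// On re-index, compare the stored fingerprint to the current source;
// any mismatch flags the memory as stale rather than deleting it.
fn refresh(mem: &mut Memory, current_source: &str) {
    mem.stale = mem.code_hash != fingerprint(current_source);
}
```

Flagging instead of deleting matters: the note may still be useful, the agent just has to re-verify it against the changed code.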
Memory search is hybrid, blending 5 signals (text relevance, semantic similarity, recency, code-graph proximity, staleness), and every result includes a "why" field explaining its ranking. No black box.
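One way such a ranker could work is a weighted blend of normalized signals, with the "why" field built from the dominant contributions. The weights, field names, and the use of a freshness score (the inverse of staleness) are all illustrative assumptions, not vexp's actual formula:

```rust
// Hypothetical per-result signal scores, each normalized to [0, 1].
struct Signals {
    text: f64,      // lexical relevance to the query
    semantic: f64,  // embedding similarity
    recency: f64,   // newer memories score higher
    proximity: f64, // code-graph distance to symbols in the query
    freshness: f64, // 1.0 for fresh, lower when the linked code changed
}

// Illustrative weights; a real system would tune these.
const W: [f64; 5] = [0.3, 0.3, 0.1, 0.2, 0.1];

fn rank(s: &Signals) -> (f64, String) {
    let parts = [s.text, s.semantic, s.recency, s.proximity, s.freshness];
    let names = ["text", "semantic", "recency", "proximity", "freshness"];
    let score: f64 = parts.iter().zip(W).map(|(v, w)| v * w).sum();
    // Attribute the score to its strongest signals so the ranking is explainable.
    let mut contrib: Vec<(f64, &str)> = parts
        .iter()
        .zip(W)
        .zip(names)
        .map(|((v, w), n)| (v * w, n))
        .collect();
    contrib.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap());
    let why = format!(
        "top signals: {} ({:.2}), {} ({:.2})",
        contrib[0].1, contrib[0].0, contrib[1].1, contrib[1].0
    );
    (score, why)
}
```

Returning the score and explanation together means the "why" is derived from the exact numbers that produced the ranking, so it can never drift out of sync with the results.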
Architecture: single native Rust binary (~15MB), SQLite with WAL mode, tree-sitter for 11 languages, MCP protocol. 100% local, zero cloud, zero account, zero network calls. Works with Claude Code, Cursor, Copilot, Windsurf, Zed, Continue, and 6 other agents. Auto-detects which agent is running and generates tailored instruction files.
Free tier: 2k nodes + all memory tools. Everything runs on your machine.