Every Claude Code session starts from zero, with no memory of previous sessions. This server uses mem0ai as a library and exposes 11 MCP tools for storing, searching, and managing memories. Qdrant handles vector storage, Ollama runs embeddings locally (bge-m3), and Neo4j optionally builds a knowledge graph.
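As a toy illustration (not the server's code) of what the embedding/vector-store pair does: each memory is embedded as a vector, and a search ranks stored memories by cosine similarity to the query vector. In the real stack, bge-m3 via Ollama produces the embeddings and Qdrant does the nearest-neighbor search; the names below are illustrative.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec: list[float], memories: list[tuple[str, list[float]]]):
    # memories: (text, embedding) pairs; returns them ranked by similarity.
    return sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
```

A vector database replaces the linear scan above with an approximate nearest-neighbor index, but the ranking idea is the same.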
Some engineering details HN might find interesting:
- Zero-config auth: auto-reads Claude Code's OAT token from ~/.claude/.credentials.json, detects the token type (OAT vs API key), and configures the SDK accordingly. No separate API key needed.
- Graph LLM ops (3 calls per add_memory) can be routed to Ollama (free/local), Gemini 2.5 Flash Lite (near-free), or a split-model setup where Gemini handles entity extraction (85.4% accuracy) and Claude handles contradiction detection (100% accuracy).
Python, MIT licensed, one-command install via uvx.