The problem: I was building a local LLM assistant and needed to store conversation history + RAG documents. ChromaDB/Pinecone require running servers. FAISS has no persistence. I wanted something that just works like SQLite.
AgentKV is a single-file mmap database with: • Vector search (HNSW index) + graph relations • Crash recovery with CRC-32 checksums • Zero servers, zero config files • Thread-safe concurrent reads • pip install agentkv (Python 3.9+) Real examples in the repo: • Local RAG with Ollama (10 lines) • Multi-turn chatbot with memory • Agent coordination with context graphs
Built in C++20 with mmap + nanobind Python bindings. Benchmarked against FAISS — competitive throughput with built-in persistence.
Open to feedback on the API design and any use cases I'm missing!