It's a Rust core with a Python CLI. One SQLite file stores everything -- text, 384-dim vector embeddings, JSON metadata, access tracking. No API keys, no cloud, no external vector DB.
What makes it different from Mem0/Engram/agent-recall:
- Hybrid search: FTS5 full-text + cosine vector search, fused with Reciprocal Rank Fusion. Text queries auto-vectorize -- no manual --vector flag needed.
- Auto-dedup: cosine similarity > 0.92 between same-type memories triggers an update instead of a new insert. Your agent can store aggressively without worrying about duplicates.
- Decay scoring: logarithmic access boost + exponential time decay (~69 day half-life). Frequently-used memories surface first; stale ones fade.
- Built-in embeddings: fastembed AllMiniLM-L6-V2 ships with the binary. No OpenAI calls.
- One-step setup: `memori setup` injects a behavioral snippet into ~/.claude/CLAUDE.md that teaches the agent when to store, search, and self-maintain its own memory.
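To make the "fused with Reciprocal Rank Fusion" step concrete, here is a minimal RRF sketch. The doc IDs and the helper name `rrf_fuse` are hypothetical; `k = 60` is the conventional RRF constant, and the actual fusion in memori may tune it differently.

```python
# Reciprocal Rank Fusion: merge an FTS5 ranking and a vector ranking.
# Each document scores 1/(k + rank) per list it appears in; sums decide the order.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts_hits = ["m3", "m1", "m7"]   # full-text (FTS5) order
vec_hits = ["m1", "m7", "m9"]   # cosine-similarity order
print(rrf_fuse([fts_hits, vec_hits]))  # → ['m1', 'm7', 'm3', 'm9']
```

A memory ranked decently in both lists (`m1`) beats one ranked first in only one list (`m3`), which is the point of fusing text and vector results instead of picking a winner.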
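The decay scoring above can be sketched as a log access boost multiplied by exponential time decay; the ~69-day half-life corresponds to a decay rate of roughly 0.01/day, since ln(2)/0.01 ≈ 69.3. The exact combination (product vs. sum, base constants) is my assumption, not the project's verified formula.

```python
import math

HALF_LIFE_DAYS = 69.3                       # ~69-day half-life from the post
DECAY = math.log(2) / HALF_LIFE_DAYS        # ≈ 0.01 per day

def decay_score(access_count: int, age_days: float) -> float:
    # Assumed combination: logarithmic access boost * exponential time decay.
    return math.log1p(access_count) * math.exp(-DECAY * age_days)

# A memory accessed 50 times but 140 days stale loses to a fresh,
# lightly-used one: frequency helps, but staleness eventually wins.
print(decay_score(5, 1) > decay_score(50, 140))  # → True
```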
Performance (Apple M4 Pro):
- UUID get: 43µs
- FTS5 text search: 65µs (1K memories) to 7.5ms (500K)
- Hybrid search: 1.1ms (1K) to 913ms (500K)
- Storage: 4.3 KB/memory, 8,100 writes/sec
- Insert + auto-embed: 18ms end-to-end

The vector search is brute-force (adequate up to ~100K memories), deliberately isolated in one function so it can be swapped for an HNSW index when someone needs it.
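Both the brute-force search and the 0.92 dedup threshold reduce to the same cosine primitive, so isolating it keeps the swap surface small. A minimal sketch (function names and the in-memory row format are my assumptions; memori reads the vectors out of SQLite):

```python
import heapq
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def brute_force_search(query: list[float],
                       rows: list[tuple[str, list[float]]],
                       k: int = 5) -> list[tuple[float, str]]:
    # One linear scan, one function: replace this body with an
    # HNSW lookup and nothing else in the pipeline changes.
    return heapq.nlargest(k, ((cosine(query, vec), mid) for mid, vec in rows))

def is_duplicate(new_vec: list[float], existing_vec: list[float],
                 threshold: float = 0.92) -> bool:
    # Dedup rule from the post: same-type memory above 0.92 cosine
    # similarity triggers an update instead of a new insert.
    return cosine(new_vec, existing_vec) > threshold
```

With 384-dim vectors and ~100K rows this scan is a few tens of millions of multiply-adds per query, which is why brute force stays adequate at that scale.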
After setup, Claude Code autonomously:
- Recalls relevant debugging lessons before investigating bugs
- Stores architecture insights that save the next session 10+ minutes of reading
- Remembers your tool preferences and workflow choices
- Cleans up stale memories and backfills embeddings

~195 tests (Rust integration + Python API + CLI subprocess), all real SQLite, no mocking.
MIT licensed.
GitHub: https://github.com/archit15singh/memori
Blog post on the design principles: https://archit15singh.github.io/posts/2026-02-28-designing-cli-tools-for-ai-agents/