I built Mnemora because every AI agent memory solution I evaluated (Mem0, Zep, Letta) routes data through an LLM on every read and write. At scale, that means 200-500ms latency per operation, token costs on your memory layer, and a runtime dependency you don't control.
Mnemora takes the opposite approach: direct database CRUD. State reads hit DynamoDB in under 10 ms. Semantic search uses pgvector with Bedrock Titan embeddings; the LLM only runs at write time, to generate the embedding vector. All reads are pure database queries.
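To make the read-path claim concrete, here's a minimal sketch of what a direct CRUD read looks like. A dict stands in for the DynamoDB table so it runs without AWS credentials; the key layout is illustrative, not Mnemora's actual schema:

```python
class WorkingMemory:
    """Key-value agent state. In production this would be a DynamoDB
    table keyed on (agent_id, key); a dict stands in here so the
    sketch is runnable without AWS."""

    def __init__(self):
        self._table = {}

    def put(self, agent_id: str, key: str, value: str) -> None:
        # Write path: one keyed PutItem-style write. No model call,
        # no tokens spent on the memory layer.
        self._table[(agent_id, key)] = value

    def get(self, agent_id: str, key: str):
        # Read path: one keyed lookup. Latency is the database
        # round-trip, not a 200-500ms LLM call.
        return self._table.get((agent_id, key))


mem = WorkingMemory()
mem.put("my-agent", "output_format", "bullet points")
print(mem.get("my-agent", "output_format"))  # bullet points
```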
Four memory types, one API:

1. Working memory: key-value state in DynamoDB (sub-10ms reads)
2. Semantic memory: vector-searchable facts in Aurora pgvector
3. Episodic memory: time-stamped event logs in S3 + DynamoDB
4. Procedural memory: rules and tool definitions (coming in v0.2)
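For the semantic type, the snippet below sketches the embed-once-at-write-time shape. The function names mirror the SDK's but this is an illustrative reimplementation, not Mnemora's code: a toy hash-based embedder stands in for Bedrock Titan, and an in-memory list stands in for the pgvector table.

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 16) -> list[float]:
    """Deterministic stand-in for a Bedrock Titan embedding call --
    the only point a model runs on the write path."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.sha256(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

_facts: list[tuple[str, str, list[float]]] = []  # (agent_id, text, embedding)

def store_memory(agent_id: str, text: str) -> None:
    # Embedding is computed here, at write time; after that it's a
    # plain INSERT into the facts table.
    _facts.append((agent_id, text, toy_embed(text)))

def search_memory(query: str, agent_id: str, k: int = 3):
    # Equivalent of: SELECT ... ORDER BY embedding <=> :query LIMIT :k
    q = toy_embed(query)
    scored = [
        (sum(a * b for a, b in zip(q, emb)), text)
        for aid, text, emb in _facts
        if aid == agent_id
    ]
    return sorted(scored, reverse=True)[:k]

store_memory("my-agent", "User prefers bullet points over prose")
print(search_memory("bullet points", agent_id="my-agent"))
```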
Architecture: fully serverless on AWS — Aurora Serverless v2, DynamoDB on-demand, Lambda, S3. Idles at ~$1/month, scales per-request. Multi-tenant by default: each API key maps to an isolated namespace at the database layer.
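A common way to get that isolation at the key level (a sketch of the pattern, not necessarily Mnemora's exact scheme) is to derive a namespace from the API key and prefix every partition key with it:

```python
import hashlib

def namespace_for(api_key: str) -> str:
    """Derive a stable tenant namespace from an API key. Illustrative:
    a real system would typically look the key up, not hash it."""
    return hashlib.sha256(api_key.encode()).hexdigest()[:12]

def partition_key(api_key: str, agent_id: str) -> str:
    # Every item lives under the tenant's namespace, so one tenant's
    # queries can never range over another tenant's items.
    return f"{namespace_for(api_key)}#{agent_id}"

print(partition_key("mnm_abc123", "my-agent"))
```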
What I'd love feedback on:

1. Is the "no LLM in CRUD path" differentiator clear and compelling?
2. Would you use this over Mem0/Zep for production agents? What's missing?
3. What memory patterns are you solving that don't fit these 4 types?
Happy to answer architecture questions.
SDK:

```shell
pip install mnemora
```

```python
from mnemora import MnemoraSync

client = MnemoraSync(api_key="mnm_...")
client.store_memory("my-agent", "User prefers bullet points over prose")
results = client.search_memory("output format preferences", agent_id="my-agent")
# [0.54] User prefers bullet points over prose
```

Drop-in LangGraph CheckpointSaver, plus LangChain and CrewAI integrations.
Links:

- 5-min quickstart: https://mnemora.dev/docs/quickstart
- GitHub: https://github.com/mnemora-db/mnemora
- PyPI: https://pypi.org/project/mnemora/
- Architecture deep-dive: https://mnemora.dev/blog/serverless-memory-architecture-for-...