What changed since the last post:
DMR benchmark: 92.0% accuracy (460/500). Retrieval hit rate is 96.4%. This is competitive with systems backed by graph databases and Python ML stacks. Engram is TypeScript + SQLite.
LOCOMO benchmark (long-conversation memory): 80.0% across all 10 conversations, 1,540 questions. A full-context baseline scores 88.4% but uses 30x more tokens.
Bi-temporal memory model. Every memory has valid_from/valid_until timestamps. Point-in-time recall via asOf parameter. Contradiction detection automatically supersedes stale facts.
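To make the bi-temporal model concrete, here's a minimal in-memory sketch of point-in-time recall and supersession. The types and function names are illustrative only, not the actual engram-sdk API (which uses valid_from/valid_until and an asOf parameter):

```typescript
// Illustrative sketch of bi-temporal recall -- not the real engram-sdk surface.
interface Memory {
  fact: string;
  validFrom: number;         // ms epoch: when the fact became true
  validUntil: number | null; // null = still valid; set when superseded
}

// Return the facts that were valid at a given point in time (asOf).
function recall(memories: Memory[], asOf: number): string[] {
  return memories
    .filter(m => m.validFrom <= asOf && (m.validUntil === null || asOf < m.validUntil))
    .map(m => m.fact);
}

// Contradiction detection closes out the stale fact instead of deleting it,
// so point-in-time queries still see history.
function supersede(memories: Memory[], staleFact: string, now: number): void {
  for (const m of memories) {
    if (m.fact === staleFact && m.validUntil === null) m.validUntil = now;
  }
}

const memories: Memory[] = [
  { fact: "Alice lives in Austin", validFrom: 100, validUntil: null },
];
supersede(memories, "Alice lives in Austin", 200);
memories.push({ fact: "Alice lives in Denver", validFrom: 200, validUntil: null });

console.log(recall(memories, 150)); // point-in-time: the old fact
console.log(recall(memories, 250)); // now: the superseding fact
```

The key design point is that nothing is ever deleted: a contradiction only sets validUntil, which is what makes asOf queries possible at all.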
Hosted API launched on Fly.io with Stripe billing. Self-hosting remains free (bring your own Gemini key). Hosted tiers start at $29/mo.
OpenAI-compatible base URL. One env var to use Groq, Cerebras, Ollama, or any OpenAI-compatible provider instead of Gemini.
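A sketch of what the provider swap looks like. The variable names here are assumptions for illustration; check the Engram README for the exact ones:

```shell
# Hypothetical env var names -- consult the project docs for the real ones.
# Point Engram at any OpenAI-compatible endpoint instead of Gemini:
export OPENAI_BASE_URL="https://api.groq.com/openai/v1"
export OPENAI_API_KEY="gsk_..."

# Or a local Ollama instance:
# export OPENAI_BASE_URL="http://localhost:11434/v1"
```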
70 tests passing. Published engram-sdk@0.5.5 on npm.
What I learned: Benchmark scores are fragile. 13 commits to my core vault module dropped LOCOMO from 84.5% to 62%. I had to treat the eval suite like a regression test, running it after every meaningful change. If you're building a memory/RAG system and not doing this, you're flying blind.
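The "eval as regression test" idea can be as simple as gating on a known-good baseline. This is a sketch with made-up names (runEval, BASELINE, the stand-in answerer), not the project's actual harness:

```typescript
// Sketch: fail loudly when an eval run drops below the last known-good score.
interface EvalCase { question: string; expected: string; }

// Score an answering function against a fixed question set.
function runEval(cases: EvalCase[], answer: (q: string) => string): number {
  const correct = cases.filter(c => answer(c.question) === c.expected).length;
  return correct / cases.length; // accuracy in [0, 1]
}

const BASELINE = 0.8; // last known-good score; hypothetical value

const cases: EvalCase[] = [
  { question: "Where does Alice live?", expected: "Denver" },
  { question: "When did she move?", expected: "May 2024" },
];

// Stand-in for the real memory-backed answerer.
const answer = (q: string) => (q.includes("live") ? "Denver" : "May 2024");

const score = runEval(cases, answer);
if (score < BASELINE) {
  throw new Error(`Eval regression: ${score} < ${BASELINE}`);
}
console.log(`eval ok: ${score}`);
```

Wiring something like this into CI (or just a pre-push script) is what turns "13 commits silently dropped the score" into a failure you see immediately.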
The judge LLM matters more than you'd think. Switching from one model to another as the benchmark judge changed scores by 10+ points on the same data. Always disclose your judge model. We use Gemini 2.5 Flash.
Temporal context is everything. Memories without timestamps are almost useless for "when" questions. Prefixing memories with their conversation date and teaching the LLM to resolve relative dates ("yesterday," "last week") was the single biggest accuracy improvement.
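The two tricks above can be sketched in a few lines: prefix every memory with its conversation date, and rewrite relative dates against that date so the stored fact carries an absolute one. Function names are illustrative, not the SDK's:

```typescript
// Illustrative helpers -- not the actual engram-sdk functions.

// Prefix a memory with the date of the conversation it came from.
function prefixWithDate(memory: string, conversationDate: string): string {
  return `[${conversationDate}] ${memory}`;
}

// Resolve "yesterday" / "last week" relative to the conversation date.
function resolveRelative(text: string, conversationDate: string): string {
  const base = new Date(conversationDate + "T00:00:00Z");
  const shift = (days: number) => {
    const d = new Date(base);
    d.setUTCDate(d.getUTCDate() - days);
    return d.toISOString().slice(0, 10); // YYYY-MM-DD
  };
  return text
    .replace(/\byesterday\b/gi, shift(1))
    .replace(/\blast week\b/gi, shift(7));
}

const raw = "I adopted a dog yesterday";
const resolved = resolveRelative(raw, "2024-05-10");
console.log(prefixWithDate(resolved, "2024-05-10"));
// → "[2024-05-10] I adopted a dog 2024-05-09"
```

In practice an LLM does the relative-date resolution rather than regexes, but the invariant is the same: by the time a fact is stored, every date in it is absolute.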
The API is the product, not the SDK. 95% of users will hit a REST endpoint, not import a TypeScript module. I wish I'd built the hosted API sooner.
What's next: LangChain/CrewAI integrations, an Engram skill for OpenClaw agents, and getting the academic paper on arXiv.
Happy to answer questions about benchmarks, architecture, or the experience of building this as a PM who codes. GitHub: https://github.com/tstockham96/engram Site: https://engram.fyi npm: https://www.npmjs.com/package/engram-sdk Hosted API: https://engram-hosted.fly.dev