What changed since the last post:
DMR benchmark: 92.0% accuracy (460/500). Retrieval hit rate is 96.4%. This is competitive with systems backed by graph databases and Python ML stacks. Engram is TypeScript + SQLite.
LOCOMO benchmark (long-conversation memory): 80.0% across all 10 conversations, 1,540 questions. A full-context baseline scores 88.4% but uses 30x more tokens.
Bi-temporal memory model. Every memory has valid_from/valid_until timestamps. Point-in-time recall via asOf parameter. Contradiction detection automatically supersedes stale facts.
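To make the bi-temporal model concrete, here's a minimal in-memory sketch of point-in-time recall and supersession. The types and function names are illustrative only, not the actual engram-sdk API (which uses valid_from/valid_until and an asOf parameter):

```typescript
// Illustrative sketch of bi-temporal recall -- not the real engram-sdk surface.
interface Memory {
  fact: string;
  validFrom: number;         // ms epoch: when the fact became true
  validUntil: number | null; // null = still valid; set when superseded
}

// Return the facts that were valid at a given point in time (asOf).
function recall(memories: Memory[], asOf: number): string[] {
  return memories
    .filter(m => m.validFrom <= asOf && (m.validUntil === null || asOf < m.validUntil))
    .map(m => m.fact);
}

// Contradiction detection closes out the stale fact instead of deleting it,
// so point-in-time queries still see history.
function supersede(memories: Memory[], staleFact: string, now: number): void {
  for (const m of memories) {
    if (m.fact === staleFact && m.validUntil === null) m.validUntil = now;
  }
}

const memories: Memory[] = [
  { fact: "Alice lives in Austin", validFrom: 100, validUntil: null },
];
supersede(memories, "Alice lives in Austin", 200);
memories.push({ fact: "Alice lives in Denver", validFrom: 200, validUntil: null });

console.log(recall(memories, 150)); // point-in-time: the old fact
console.log(recall(memories, 250)); // now: the superseding fact
```

The key design point is that nothing is ever deleted: a contradiction only sets validUntil, which is what makes asOf queries possible at all.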
Hosted API launched on Fly.io with Stripe billing. Self-hosting remains free (bring your own Gemini key). Hosted tiers start at $29/mo.
OpenAI-compatible base URL. One env var to use Groq, Cerebras, Ollama, or any OpenAI-compatible provider instead of Gemini.
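A sketch of what the provider swap looks like. The variable names here are assumptions for illustration; check the Engram README for the exact ones:

```shell
# Hypothetical env var names -- consult the project docs for the real ones.
# Point Engram at any OpenAI-compatible endpoint instead of Gemini:
export OPENAI_BASE_URL="https://api.groq.com/openai/v1"
export OPENAI_API_KEY="gsk_..."

# Or a local Ollama instance:
# export OPENAI_BASE_URL="http://localhost:11434/v1"
```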
70 tests passing. Published engram-sdk@0.5.5 on npm.
What I learned: Benchmark scores are fragile. 13 commits to my core vault module dropped LOCOMO from 84.5% to 62%. I had to treat the eval suite like a regression test, running it after every meaningful change. If you're building a memory/RAG system and not doing this, you're flying blind.
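The "eval as regression test" idea can be as simple as gating on a known-good baseline. This is a sketch with made-up names (runEval, BASELINE, the stand-in answerer), not the project's actual harness:

```typescript
// Sketch: fail loudly when an eval run drops below the last known-good score.
interface EvalCase { question: string; expected: string; }

// Score an answering function against a fixed question set.
function runEval(cases: EvalCase[], answer: (q: string) => string): number {
  const correct = cases.filter(c => answer(c.question) === c.expected).length;
  return correct / cases.length; // accuracy in [0, 1]
}

const BASELINE = 0.8; // last known-good score; hypothetical value

const cases: EvalCase[] = [
  { question: "Where does Alice live?", expected: "Denver" },
  { question: "When did she move?", expected: "May 2024" },
];

// Stand-in for the real memory-backed answerer.
const answer = (q: string) => (q.includes("live") ? "Denver" : "May 2024");

const score = runEval(cases, answer);
if (score < BASELINE) {
  throw new Error(`Eval regression: ${score} < ${BASELINE}`);
}
console.log(`eval ok: ${score}`);
```

Wiring something like this into CI (or just a pre-push script) is what turns "13 commits silently dropped the score" into a failure you see immediately.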
The judge LLM matters more than you'd think. Switching from one model to another as the benchmark judge changed scores by 10+ points on the same data. Always disclose your judge model. We use Gemini 2.5 Flash.
Temporal context is everything. Memories without timestamps are almost useless for "when" questions. Prefixing memories with their conversation date and teaching the LLM to resolve relative dates ("yesterday," "last week") was the single biggest accuracy improvement.
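The two tricks above can be sketched in a few lines: prefix every memory with its conversation date, and rewrite relative dates against that date so the stored fact carries an absolute one. Function names are illustrative, not the SDK's:

```typescript
// Illustrative helpers -- not the actual engram-sdk functions.

// Prefix a memory with the date of the conversation it came from.
function prefixWithDate(memory: string, conversationDate: string): string {
  return `[${conversationDate}] ${memory}`;
}

// Resolve "yesterday" / "last week" relative to the conversation date.
function resolveRelative(text: string, conversationDate: string): string {
  const base = new Date(conversationDate + "T00:00:00Z");
  const shift = (days: number) => {
    const d = new Date(base);
    d.setUTCDate(d.getUTCDate() - days);
    return d.toISOString().slice(0, 10); // YYYY-MM-DD
  };
  return text
    .replace(/\byesterday\b/gi, shift(1))
    .replace(/\blast week\b/gi, shift(7));
}

const raw = "I adopted a dog yesterday";
const resolved = resolveRelative(raw, "2024-05-10");
console.log(prefixWithDate(resolved, "2024-05-10"));
// → "[2024-05-10] I adopted a dog 2024-05-09"
```

In practice an LLM does the relative-date resolution rather than regexes, but the invariant is the same: by the time a fact is stored, every date in it is absolute.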
The API is the product, not the SDK. 95% of users will hit a REST endpoint, not import a TypeScript module. I wish I'd built the hosted API sooner.
What's next: LangChain/CrewAI integrations, an Engram skill for OpenClaw agents, and getting the academic paper on arXiv.
Happy to answer questions about benchmarks, architecture, or the experience of building this as a PM who codes. GitHub: https://github.com/tstockham96/engram Site: https://engram.fyi npm: https://www.npmjs.com/package/engram-sdk Hosted API: https://engram-hosted.fly.dev