Every team I've worked on has the same problem. Someone makes a decision, usually a good one, with a lot of context behind it. And then, six months later, someone asks "why does this work this way?" and the answer is... gone. Buried in Slack. Lost in a PR description nobody remembers. Living only in the head of the person who wrote it, if they're still on the team.
In the age of agentic development, this fundamental problem has only gotten worse. The time it takes to code up an epic is no longer the long pole in the SDLC tent; knowledge is now generated faster than anyone can absorb it.
Distillery captures context inside the coding assistant via slash commands (/distill, /recall, /pour) and stores it in DuckDB with vector similarity search. No separate vector DB, no Postgres: a single file. Backup is `cp`. The storage layer is designed to be extensible to other datastores.
The feature that really excites me is ambient intelligence. You point it at GitHub repos, RSS feeds, subreddits, and it polls on a schedule, scores every item for relevance against your existing context using embedding similarity, and surfaces what matters. It builds a recency-weighted interest profile from what you've captured, so the more you use it, the better the signal-to-noise ratio gets.
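The scoring idea is simple enough to sketch. This is illustrative only, not Distillery's actual code: the function names, the dict shape, and the 30-day half-life are my assumptions. The gist is that every captured entry's embedding contributes to an interest profile, weighted by how recently it was captured, and each incoming feed item is scored by cosine similarity against that profile:

```python
import math
import time

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def interest_profile(entries, now=None, half_life_days=30.0):
    """Recency-weighted mean of captured-entry embeddings (sketch).

    Each entry is assumed to look like
    {"embedding": [...], "captured_at": unix_timestamp}.
    Newer captures count more via exponential decay, so the profile
    tracks what you've cared about lately.
    """
    now = now or time.time()
    dim = len(entries[0]["embedding"])
    profile = [0.0] * dim
    total = 0.0
    for e in entries:
        age_days = (now - e["captured_at"]) / 86400
        w = 0.5 ** (age_days / half_life_days)  # half weight every half-life
        profile = [p + w * x for p, x in zip(profile, e["embedding"])]
        total += w
    return [p / total for p in profile]

def score_item(item_embedding, profile):
    """Relevance of one feed item against the interest profile."""
    return cosine(item_embedding, profile)
```

A feed item similar to your recent captures scores near 1.0; unrelated items score near 0.0, which is what drives the improving signal-to-noise ratio.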
Karpathy's LLM Wiki gist (https://gist.github.com/karpathy/442a6bf555914893e9891c11519...) describes essentially the same insight: persistent, compounding knowledge rather than RAG's per-query rediscovery. Distillery differs in implementation: DuckDB + MCP tools instead of static markdown files, with ambient intelligence on top.
Technical choices:
- Hybrid search: BM25 + vector similarity with Reciprocal Rank Fusion, so "DuckDBStore" finds the exact class and "how does storage work" finds the conceptual entries
- stdio transport for local use, HTTP + GitHub OAuth for shared team context
- FastMCP for the MCP wire protocol — works with any MCP client, not just Claude Code
- Jina AI or OpenAI embeddings, pluggable via Protocol interface
- 3-tier semantic dedup (skip >0.95, merge 0.80–0.95, link 0.60–0.80)
- DuckDB + VSS extension (HNSW, cosine similarity) for storage + search in one file
- Chainguard Wolfi container, signed with Cosign, SBOM attested
- Run with `uvx distillery-mcp`
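For the curious, Reciprocal Rank Fusion is small enough to sketch. This is a generic RRF implementation, not Distillery's exact code; k=60 is the conventional constant from the original RRF paper:

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc ids via Reciprocal Rank Fusion.

    Each ranking is a list of ids, best first (e.g. one from BM25,
    one from vector search). A doc's fused score is the sum over
    rankings of 1 / (k + rank), so a doc ranked highly by either
    retriever floats to the top without any score normalization.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. fusing a keyword ranking with a vector ranking:
bm25_hits = ["duckdb_store", "readme", "search_docs"]
vector_hits = ["search_docs", "duckdb_store", "dedup_notes"]
fused = rrf_fuse([bm25_hits, vector_hits])
```

The appeal over weighted score blending is that RRF only consumes ranks, so BM25's unbounded scores and cosine's [-1, 1] range never need to be reconciled.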
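The 3-tier dedup reduces to a small decision function over the nearest neighbor's cosine similarity. The thresholds are the ones listed above; the action names and the function itself are just illustrative labels, not Distillery's internals:

```python
def dedup_action(similarity):
    """Map a new entry's similarity to its nearest stored neighbor
    onto one of the dedup tiers (thresholds as listed above)."""
    if similarity > 0.95:
        return "skip"   # near-duplicate: don't store it again
    if similarity >= 0.80:
        return "merge"  # same topic: fold into the existing entry
    if similarity >= 0.60:
        return "link"   # related: store it, cross-link the pair
    return "store"      # novel: store as a fresh entry
```

Below 0.60 nothing special happens; the entry is simply stored, which keeps the knowledge base growing without accumulating near-copies.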
Blog post with more detail: https://norrietaylor.github.io/distillery/blog/building-a-se...
Happy to answer questions about the architecture, the feed scoring pipeline, or the experience of building a tool with itself in a week.