One interesting challenge has been balancing recall speed vs. depth. Raw vector search is fast but misses context. Full graph traversal finds everything but kills latency. The tiered approach lets us start fast and go deeper only when needed.
Always curious to hear how others are tackling agent memory!
tylerrecall•1h ago
RecallBricks is plug-and-play memory infrastructure for AI agents. It lets agents store and retrieve durable context (preferences, decisions, feedback, and relationships) independently of the LLM or agent framework being used.
Most existing approaches treat memory as either raw vector search or framework-specific abstractions. That works for demos, but breaks down for long-running or multi-tool agents. We wanted something in between: structured memory with metadata, relationships, and lifecycle rules that persist across sessions and runs.
Under the hood, RecallBricks uses a multi-stage recall pipeline (fast heuristics → contextual retrieval → deeper reasoning when needed). Agents retrieve only the relevant context instead of reloading everything into prompts, and the fast path is backed by pgvector to keep recall latency low.
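The staged-escalation idea can be sketched in a few lines. This is a toy illustration of the pattern, not the actual pipeline; the stage functions, confidence scores, and threshold are all made up.

```python
# Tiered recall: try cheap stages first, escalate only when confidence is low.
def recall(query, stages, confidence_threshold=0.8):
    """stages: list of (name, retrieve_fn) ordered cheap -> expensive.
    Each retrieve_fn returns (results, confidence)."""
    for name, retrieve in stages:
        results, confidence = retrieve(query)
        if confidence >= confidence_threshold:
            return name, results          # good enough: stop early
    return name, results                  # fall through to the deepest stage

# Toy stand-ins for fast heuristics, contextual retrieval, deeper reasoning.
fast = lambda q: (["cached fact"], 0.5)            # fast pass misses
contextual = lambda q: (["linked decision"], 0.9)  # contextual pass succeeds
deep = lambda q: (["full traversal"], 1.0)

stage_used, hits = recall("what did the user decide?", [
    ("fast", fast), ("contextual", contextual), ("deep", deep),
])
print(stage_used, hits)  # contextual ['linked decision'] -- deep never ran
```

This is also where the latency tradeoff from the top of the thread shows up: most queries resolve at the cheap stage, so the expensive traversal only pays its cost when it's actually needed.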
One meta detail: once it was usable, I connected Claude to RecallBricks via MCP. Claude now retains memory across the entire multi-month build of RecallBricks itself. I've been using RecallBricks to build RecallBricks.
This is early but live. People are already using it in agent workflows, and I'm actively refining how memories are ranked, linked, and decayed over time.
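For a sense of what "decayed over time" might mean, here's one common scheme: exponential decay with a half-life, so a memory's rank halves every N days since last use. This is my assumption for illustration; RecallBricks' actual decay rules may differ.

```python
# Illustrative time-decayed ranking: score = relevance * recency decay.
from datetime import datetime, timedelta

def decayed_score(relevance: float, last_used: datetime,
                  now: datetime, half_life_days: float = 30.0) -> float:
    age_days = (now - last_used).total_seconds() / 86400
    decay = 0.5 ** (age_days / half_life_days)  # halves every half_life_days
    return relevance * decay

now = datetime(2024, 6, 1)
fresh = decayed_score(0.9, now - timedelta(days=1), now)
stale = decayed_score(0.9, now - timedelta(days=120), now)
print(round(fresh, 3), round(stale, 3))  # the stale memory ranks far lower
```

A half-life keeps old memories retrievable (they never hit exactly zero) while letting fresh context dominate, which matters for long-running agents where both recent decisions and old preferences can be relevant.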
I'd love feedback from people building agents or long-running AI systems. What kinds of context do your agents lose today? Where do current memory patterns break down? What would make a separate memory layer not worth using?
Happy to answer questions and discuss tradeoffs.