What should memory for agents actually be?
The basic idea is simple: coding agents are good at local edits, but they often miss the project-specific knowledge that experienced engineers carry around in their heads.
The memory itself is Markdown- and Git-based. A source file can have a matching onboarding file, and route overviews describe larger areas. A ledger called memory.md maps code commits to memory commits, which anchors the memory repo to the code repo. In external mode the two are physically separate, since some people don't want a large number of Markdown files in their code repo. The ledger works as a lookup table, so you can go back to an earlier version of the memory and still have it in sync with the code, which is very helpful when you need to restore from a bad state. The lookup table also lets you run code and memory in dual worktrees, keeping memory changes local until your feature or refactor is clean and ready to merge. This protects the memory's main branch from corruption. In other words, memory is treated like code: it becomes a first-class citizen, protected by the same Git mechanics.
With isolated work environments you also get separate code graph and grepai instances, run in Docker. Their memory is cloned with minimal changes so it maps cleanly into the new environment. Cloning avoids re-indexing, so providers can be spun up and thrown away together with the isolated environment.
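To make the ledger idea concrete, here is a minimal sketch of the lookup. The real layout of memory.md is not shown here, so the two-column Markdown table, the sample hashes, and the function names are all assumptions for illustration.

```python
from typing import Optional

# Hypothetical ledger layout: a Markdown table, one row per code commit.
SAMPLE_LEDGER = """\
| code commit | memory commit |
|-------------|---------------|
| a1b2c3d     | 9f8e7d6       |
| e4f5a6b     | 1c2d3e4       |
"""


def _is_hex(value: str) -> bool:
    """True for a non-empty lowercase hex string (a plausible commit hash)."""
    return bool(value) and all(ch in "0123456789abcdef" for ch in value)


def parse_ledger(text: str) -> dict[str, str]:
    """Build a code-commit -> memory-commit mapping from the table rows."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue
        code_commit, memory_commit = cells
        # Header and separator rows are not hex strings, so they are skipped.
        if _is_hex(code_commit) and _is_hex(memory_commit):
            mapping[code_commit] = memory_commit
    return mapping


def memory_commit_for(ledger: dict[str, str], code_commit: str) -> Optional[str]:
    """Return the memory commit paired with a code commit, or None if unknown."""
    return ledger.get(code_commit)
```

With a mapping like this, restoring memory for an older code state comes down to looking up the paired memory commit and checking it out in the memory repo.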
To verify memory, every Markdown doc file has a header that tracks the last known commit hash of the code file it documents. That way a simple script makes staleness detection cheap. This is one of the main reasons I decided on a path-mirrored documentation method: the documents mirror the code paths, but in a parallel folder. That simplifies not just staleness detection but also retrieval. An agent that opens a code file automatically knows where the document is, and has the assurance that the material is highly relevant.
Overview.md files are more difficult to invalidate because they cover routes or even the entire project. As the name says, they give broader overviews, which helps to get the gist, but that broadness makes validation more challenging. Validation is still possible deterministically, by using hot paths within the overviews and script-generated index files that monitor changing routes or larger file movements. So the model gets a clean deterministic signal and knows which parts of the overview files it has to update, either by pulling up git diffs or by looking into the file-level Markdown docs that tell the story.
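A rough sketch of how path mirroring and the header check could fit together. The folder name `memory`, the `.md` suffix convention, and the `code-commit:` header key are assumptions of mine, not the actual format. The `git log -1 --format=%H -- <path>` call is standard Git and returns the last commit that touched a file.

```python
import re
import subprocess
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical header line, e.g. "code-commit: a1b2c3d"
HEADER_RE = re.compile(r"^code-commit:[ \t]*([0-9a-f]{7,40})[ \t]*$", re.MULTILINE)


def mirrored_doc_path(code_path: str, memory_root: str = "memory") -> PurePosixPath:
    """Map a code file to its doc at the same relative path in a parallel folder."""
    return PurePosixPath(memory_root) / (code_path + ".md")


def tracked_commit(doc_text: str) -> Optional[str]:
    """Read the commit hash recorded in the doc header, if there is one."""
    match = HEADER_RE.search(doc_text)
    return match.group(1) if match else None


def last_commit_for(code_path: str, repo_dir: str = ".") -> str:
    """Ask Git for the last commit that touched the code file."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%H", "--", code_path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_stale(doc_text: str, current_commit: str) -> bool:
    """A doc is stale if it has no header or tracks a different commit."""
    tracked = tracked_commit(doc_text)
    if tracked is None:
        return True
    # The header may hold an abbreviated hash, so compare by prefix.
    return not current_commit.startswith(tracked)
```

Because the doc path is derived purely from the code path, both retrieval and the staleness check need no search step: one string operation finds the doc, one Git call tells you whether it is current.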
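One way the deterministic signal for overviews could look, as a sketch only: assume each overview declares the directory prefixes it covers (its hot paths) and that a script supplies the changed files, for example from `git diff --name-only`. The dictionary shape and the function name are hypothetical.

```python
def affected_overviews(
    hot_paths: dict[str, list[str]],
    changed_files: list[str],
) -> dict[str, list[str]]:
    """Return, per overview, the changed files that fall under its hot paths."""
    hits: dict[str, list[str]] = {}
    for overview, prefixes in hot_paths.items():
        matched = [
            changed
            for changed in changed_files
            # Match the path itself or anything below it, but not "src/apix"
            # when the hot path is "src/api".
            if any(
                changed == prefix or changed.startswith(prefix.rstrip("/") + "/")
                for prefix in prefixes
            )
        ]
        if matched:
            hits[overview] = matched
    return hits
```

The model then only needs to look at the overviews that appear in the result, and the matched file list tells it which diffs or file-level docs to read.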
Another interesting part is the split of responsibility. The model should not have to manually track everything. It should reason with the developer, frame the problem, surface assumptions, compare options, and ask for the right approvals.
The deterministic work gets offloaded to an MCP server.
My system routes every session through a lifecycle:
request → trust check → reframe/research → decide → build → close
Before coding, the agent has to resolve context, check drift/provider state, reframe the task, gather evidence, and wait for developer agreement. Implementation approval is not commit approval. Commit, push, PR, merge, cleanup, and memory carryover are separate gates.
A pattern that has become important recently is evidence accounting. For deeper research, the agent records what kind of evidence it used.
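The lifecycle and the separate gates can be sketched as a small state machine. This is an illustration under my own assumptions, not how the MCP server is actually implemented: the class name, the gate labels, and the choice of exceptions are hypothetical.

```python
from dataclasses import dataclass, field

PHASES = ["request", "trust check", "reframe/research", "decide", "build", "close"]
# Each gate needs its own approval; approving one never implies another.
GATES = {"implement", "commit", "push", "pr", "merge", "cleanup", "memory carryover"}


@dataclass
class Session:
    phase_index: int = 0
    approvals: set = field(default_factory=set)

    @property
    def phase(self) -> str:
        return PHASES[self.phase_index]

    def approve(self, gate: str) -> None:
        """Record a developer approval for exactly one gate."""
        if gate not in GATES:
            raise ValueError(f"unknown gate: {gate}")
        self.approvals.add(gate)

    def may(self, gate: str) -> bool:
        """True only if this specific gate has been approved."""
        return gate in self.approvals

    def advance(self) -> str:
        """Move to the next phase; entering 'build' requires implementation approval."""
        if self.phase_index == len(PHASES) - 1:
            raise RuntimeError("session is already closed")
        next_phase = PHASES[self.phase_index + 1]
        if next_phase == "build" and not self.may("implement"):
            raise PermissionError("developer has not approved implementation")
        self.phase_index += 1
        return next_phase
```

The point of keeping approvals as a set of named gates is that a session with `implement` approved can build, but `may("commit")` stays false until the developer approves that step on its own.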
Is this still “AI memory” in the usual sense, or is it more like agent operating context?