Classic failure mode:
User (day 1): “My favorite editor is VSCode”
User (day 47): “Actually switched to Cursor”
Agent (day 92): “You love VSCode, right?”
Or the memory index slowly becoming this:
User prefers Python
User REALLY prefers Python
Python is basically the user's personality
All equally weighted. Forever.
Vector search is great at similarity, but terrible at current truth.
Human memory fades old opinions and overwrites bad takes. My agent? Eternal archive of everything I ever said.
So I built MemX as a small experiment: what happens if an agent’s memory actually has a lifecycle?
Things like:
importance signals
frequency reinforcement
duplicate compression
explicit updates / superseding
gentle decay for outdated facts
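Those signals can be folded into one retrieval weight. Here's a minimal sketch of the idea — the formula, half-life, and numbers are my own invention, not MemX's actual scoring:

```python
import math

# Hypothetical lifecycle score (not MemX's real formula): reinforce on
# repeat mentions, decay exponentially with age, and let an explicit
# supersession zero the memory out entirely.
def memory_score(importance, hits, age_days, superseded=False, half_life=30.0):
    if superseded:
        return 0.0                            # explicit updates win outright
    reinforcement = 1.0 + math.log1p(hits)    # frequency reinforcement
    decay = 0.5 ** (age_days / half_life)     # gentle decay for old facts
    return importance * reinforcement * decay

# A stale, oft-repeated fact loses to a fresh update:
old = memory_score(importance=0.9, hits=5, age_days=90)  # "loves VSCode"
new = memory_score(importance=0.7, hits=1, age_days=2)   # "switched to Cursor"
```

With these (arbitrary) parameters, the 90-day-old VSCode memory scores below the two-day-old Cursor one even though it was mentioned more often.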
Stack is intentionally boring:
SQLite
FAISS
some scoring logic pretending to be a “memory OS”
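To make the SQLite half concrete, here's roughly what the table could look like — an invented schema with a `superseded_by` pointer, not the repo's actual one (FAISS only handles the similarity side):

```python
import sqlite3

# Invented schema for illustration: each memory carries a score and a
# superseded_by pointer so retrieval can skip stale rows outright.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    text TEXT,
    score REAL DEFAULT 1.0,
    superseded_by INTEGER REFERENCES memories(id)
)""")
con.execute("INSERT INTO memories (text) VALUES (?)",
            ("User's favorite editor: VSCode",))
con.execute("INSERT INTO memories (text) VALUES (?)",
            ("User switched to Cursor",))
# an explicit update marks the old fact as superseded by the new one
con.execute("UPDATE memories SET superseded_by = 2, score = 0.0 WHERE id = 1")
live = [r[0] for r in con.execute(
    "SELECT text FROM memories WHERE superseded_by IS NULL ORDER BY score DESC")]
```

Retrieval then only ranks the live rows, so the day-92 agent never sees VSCode as current truth.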
Tiny example:
from memx import MemX
m = MemX()
m.add("User's favorite editor: VSCode")
m.add("User switched to Cursor")
m.compress()
m.rag("what editor should I use?")
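One cheap way to implement something like compress() is folding near-duplicates into a single reinforced memory. A stdlib sketch, with SequenceMatcher standing in for the embedding similarity MemX presumably uses:

```python
from difflib import SequenceMatcher

# Hypothetical duplicate compression: each memory is (text, hit_count);
# anything above the similarity threshold folds into an existing memory,
# boosting its hit count instead of adding a near-identical row.
def compress(memories, threshold=0.8):
    kept = []
    for text, hits in memories:
        for i, (kept_text, kept_hits) in enumerate(kept):
            if SequenceMatcher(None, text, kept_text).ratio() >= threshold:
                kept[i] = (kept_text, kept_hits + hits)  # reinforce, don't duplicate
                break
        else:
            kept.append((text, hits))
    return kept

memories = [("User prefers Python", 1),
            ("User REALLY prefers Python", 1),
            ("User switched to Cursor", 1)]
compressed = compress(memories)
```

The two Python-preference variants collapse into one memory with a hit count of 2, which is exactly the "REALLY prefers Python" pile-up from above.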
Quick benchmark (50 noise + 5 real memories):
Vector RAG recall@3: ~0.75
MemX recall@3: ~1.0
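For reference, recall@3 here is just the fraction of queries whose gold memory lands in the top 3 retrieved results. The queries and ids below are illustrative, not from the repo's benchmark:

```python
# Standard recall@k over a set of queries: a query counts as a hit if its
# gold memory appears anywhere in the top-k retrieved ids.
def recall_at_k(results, gold, k=3):
    hits = sum(1 for retrieved, answer in zip(results, gold)
               if answer in retrieved[:k])
    return hits / len(gold)

# Toy run: 4 queries; the query about a superseded fact misses under plain
# vector RAG because the stale memory outranks the current one.
vector_results = [["m1", "m2", "m3"],
                  ["m9", "m4", "m0"],
                  ["m7", "m5", "m6"],
                  ["m1_stale", "m3", "m5"]]
gold = ["m2", "m4", "m7", "m8_current"]
recall_at_k(vector_results, gold)  # 0.75
```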
Longer simulation:
10k interactions → ~9k memories
Latency on my laptop:
10k memories → ~0.3ms
100k → ~3ms
1M → ~30ms
This isn't trying to replace vector search or be the next agent framework. It's just an experiment in whether agents behave better if memory has a lifecycle instead of being a permanent document archive.
Curious how others handle contradictory or evolving memories in long-running agents.
Repo: https://github.com/mohitkumarrajbadi/memx
(And yes, the capsicum thing is real. The agent still argues about it.)
davydm•1h ago
But I like your idea - minds evolve, and part of the problem with LLMs is that they're largely static in nature - no evolution, no upskilling, just waiting for the next model to come out. I understand the ramifications of learning all the time from user input (especially possible poisoning), but I've always felt this is an area that could improve a lot.