A few years ago everyone was using RAG: embeddings and databases layered on top of models. Now models with access to local markdown and memory files (like OpenClaw), using grep and simple UNIX tools, seem to readily outperform those database setups.
Is this an inherent issue in scaling LLMs? Does Obsidian work that much better for most people? Is anyone finding anything that actually outperforms markdown?
At this point the main bottleneck in my adoption seems to be memory and persistent long term context, not quality or reliability of the models.
I'm curious if there are any technical or scaling metrics we could use to forecast where this will end up going.
kageroumado•57m ago
The full context then looks something like: [intro prompt] + [old exchanges, level-1 summaries] + [larger system prompt] + [more recent exchanges, level-0 summaries] + [temporal context] + [recent messages with tool results stripped] + [recent messages including tool results]
Tool results are progressively stripped because they are generally only useful for a few turns. This lets us keep everything we've ever done in the context, and the model can easily look up more information by expanding each node. It's a single perpetual session that never compacts during active work.
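For what it's worth, here's a minimal sketch of what that assembly could look like. All names (`build_context`, `TOOL_RESULT_WINDOW`, the message dict shape) are hypothetical illustrations, not from any real tool:

```python
TOOL_RESULT_WINDOW = 3  # keep raw tool output only for the last few turns (assumed cutoff)

def render_message(msg, keep_tools):
    """Render one exchange, optionally dropping its tool results."""
    parts = [f"{msg['role']}: {msg['text']}"]
    if keep_tools and msg.get("tool_results"):
        parts.extend(f"[tool] {r}" for r in msg["tool_results"])
    return "\n".join(parts)

def build_context(intro, lvl1_summaries, system_prompt,
                  lvl0_summaries, temporal, recent_messages):
    """Assemble the full prompt in the layered order described above."""
    n = len(recent_messages)
    rendered = [
        # only the newest TOOL_RESULT_WINDOW turns keep their tool results
        render_message(m, keep_tools=(n - i <= TOOL_RESULT_WINDOW))
        for i, m in enumerate(recent_messages)
    ]
    sections = (
        [intro]
        + lvl1_summaries      # level-1 summaries of old exchanges
        + [system_prompt]
        + lvl0_summaries      # level-0 summaries of more recent exchanges
        + [temporal]
        + rendered            # raw recent turns, tool results on the newest only
    )
    return "\n\n".join(sections)
```

The nice property is that nothing is ever deleted, only demoted to a cheaper representation, so total context size grows roughly with the number of summaries rather than the number of raw turns.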
I find it outperforms every other solution I've tried for my use case (a personal assistant).