1. Chat
2. Autocomplete
3. Embedding knowledge using RAG
4. Tool calling by LLMs (CLI or MCP)
5. Agentic LLMs executing task(s)
What do you see as the next step or iteration?
My theory is that we will get more aggressively quantized, more efficient models by the end of 2026, and my hope is that we will have mini models that wrap around tools (I call them domain agents) and just give answers without bloating the context.
I.e., the domain agent gives the calling agent the sausage but doesn't explain how the sausage was made.
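
To make the domain agent idea a bit more concrete, here is a rough Python sketch of how I picture it. Everything in it is hypothetical and only for illustration: `DomainAgent`, `fake_disk_tool`, and `pick_fullest` are made-up names, the "mini model" is just a plain function standing in for a small LLM, and the tool is a stub rather than a real CLI or MCP server. The point is only that the verbose tool output stays inside the domain agent and the caller's context receives one short answer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DomainAgent:
    """Hypothetical wrapper: a small model plus one tool, with a private scratchpad."""
    name: str
    tool: Callable[[str], str]        # the wrapped tool (stand-in for a CLI or MCP call)
    summarize: Callable[[str], str]   # stand-in for the mini model
    _scratch: List[str] = field(default_factory=list)  # raw output, never returned

    def ask(self, question: str) -> str:
        raw = self.tool(question)     # verbose tool output stays local
        self._scratch.append(raw)
        return self.summarize(raw)    # only the short answer goes back to the caller


def fake_disk_tool(_query: str) -> str:
    # Stub for a noisy tool; a real one might emit thousands of tokens.
    return "\n".join(f"/dev/sda{i} {10 * i}% used" for i in range(1, 6))


def pick_fullest(raw: str) -> str:
    # Stub for the mini model: reduce the raw output to a single line.
    rows = [line.split() for line in raw.splitlines()]
    device, pct = max(
        ((r[0], int(r[1].rstrip("%"))) for r in rows),
        key=lambda pair: pair[1],
    )
    return f"{device} is fullest at {pct}%"


disk_agent = DomainAgent("disk", fake_disk_tool, pick_fullest)

# The calling agent's context only ever sees the one-line answer.
caller_context = [disk_agent.ask("Which disk is fullest?")]
print(caller_context)
```

In this sketch the calling agent gets the sausage (one line), while the five lines of raw tool output sit in the domain agent's private scratchpad. In a real setup the scratchpad could be thrown away after each call.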
Curious what your theories are, but I think we might need a whole rethink of the architecture for combining LLMs with tools.