Pretty cool direction — CLI tools for coders make total sense.
But every time I test these with multi-turn prompts, they start hallucinating like a drunk intern reading from an old terminal log.
The deeper issue isn't just fine-tuning or prompt phrasing: once your semantic path drifts (especially in multi-hop tasks), even the best coders end up talking to an agent that has silently forgotten half the context.
We ran into the same issue and had to build a projection-based reasoning core just to keep the thread from collapsing after turn 3.
Would love to see how Qwen handles semantic drift over longer sessions. If you've tested that, do share!
TXTOS•4h ago