The basic idea is good. If a task is messy, you can split it into smaller threads, run them in parallel, and merge the results. For large refactors, multi-file debugging, or research implementation work, this can save a lot of time.
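The split/parallelize/merge pattern above can be sketched in a few lines. This is a toy illustration, not any real tool's implementation: `run_subtask` is a hypothetical stand-in for a model call, and the merge step is just ordered concatenation.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subtask(task: str) -> str:
    # Hypothetical stand-in: a real agent would call a model API here.
    return f"result for {task!r}"

def fan_out_and_merge(subtasks: list[str]) -> str:
    # Run each subtask on its own worker thread, then merge results
    # back in the original task order.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        results = list(pool.map(run_subtask, subtasks))
    return "\n".join(results)

merged = fan_out_and_merge(["refactor module A", "debug module B"])
```

The appeal is obvious: independent subtasks finish concurrently instead of serially. The catch, as described below, is *which* model ends up inside `run_subtask`.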
But I am starting to notice a strange manager effect.
I ask for a top-tier model because I want the best reasoning and engineering judgment. Instead, the main agent often seems to switch into "manager" mode and delegate most of the actual work to smaller models. The smaller model produces an okay solution, and then the expensive model turns it into a polished final answer.
So the smart model becomes a manager, the cheaper model does the heavy lifting, and the whole chain burns tokens. Ironically, the manager is often the most expensive and the least useful link in that chain.
What I want from these tools is mostly transparency and control. I want to see which model handled which step. I want the option to disable delegation, or restrict it to certain subtasks. And if the tool is making a speed-versus-quality tradeoff, I want that tradeoff to be explicit.
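To make the wish concrete, here is a sketch of what such controls could look like. Every key here is hypothetical; no current tool, to my knowledge, exposes exactly this, which is the point.

```python
# Hypothetical agent settings: explicit, user-visible delegation controls.
agent_config = {
    "delegation": {
        "enabled": False,  # force the flagship model to do the work itself
        # ...or whitelist only low-stakes subtasks for cheaper models:
        "allowed_subtasks": ["codebase_search", "log_summarization"],
    },
    # Log which model handled each step, so the tradeoff is visible.
    "report_model_per_step": True,
}
```

Even just the last flag would help: if the tool told me after each run which model wrote which part, I could judge the speed-versus-quality tradeoff myself.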
Subagents are genuinely useful. But if I pay for a flagship model, I want flagship-level work, not a summary of a weaker model’s draft.
Curious whether others are seeing the same thing.