If you use Claude Code on anything bigger than a small script, you've hit the token limits faster than you'd like, or experienced context rot mid-session. The agent loses track of files, forgets architecture decisions, starts making things up. I was hitting this every single day.
DevSquad is a Claude Code plugin, not a framework, not a new CLI. You install it, and it hooks into Claude Code's execution to delegate subtasks to Gemini and Codex. When context gets heavy, specific work gets offloaded: tests to one model, docs to another, refactoring to a third.
For a quick proof of concept, I initially wrote a CLAUDE.md that did the job, but I often had to remind the agent to follow it, and the behavior wasn't predictable or reliable. Hence hook-enforced delegation, not polite suggestions via CLAUDE.md. The plugin intercepts at defined trigger points and routes work through structured handoffs, so each agent operates in a focused context window instead of everyone fighting over one.
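To make "hook-enforced delegation" concrete: Claude Code hooks run a command at defined trigger points, and that command receives event details as JSON and can return a decision. Here's a minimal sketch of the routing idea in Python — the `task_kind` field, the routing table, and the model names are illustrative assumptions, not DevSquad's actual payload schema or configuration:

```python
import json

# Hypothetical routing table (illustrative, not DevSquad's real config):
# which external model handles which kind of subtask.
ROUTES = {
    "tests": "gemini",
    "docs": "gemini",
    "refactor": "codex",
}

def route(task_kind: str) -> str:
    """Pick a delegate for a subtask; unknown kinds stay with Claude."""
    return ROUTES.get(task_kind, "claude")

def handle_hook_event(payload: str) -> str:
    """Parse the JSON payload a hook script would receive and emit a
    delegation decision as JSON. A real plugin would spawn the target
    model's CLI here instead of just returning the decision."""
    event = json.loads(payload)
    decision = {"delegate_to": route(event.get("task_kind", ""))}
    return json.dumps(decision)
```

For example, `handle_hook_event('{"task_kind": "tests"}')` returns `'{"delegate_to": "gemini"}'`. The point of putting this in a hook rather than in CLAUDE.md is that the routing runs deterministically on every trigger, instead of depending on the model remembering an instruction.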
I know what you're thinking: there have been a dozen agent orchestrators on Show HN this month. Here's why I built another one: all of them require you to set up a new environment — Docker, a new CLI, YAML configs. DevSquad hooks into the tool you already use.
How I built it: I'm a product person, not a developer. I led product at an AdTech company for years but never wrote production code. DevSquad was built entirely with AI coding tools — Claude Code, Google Antigravity, and yes, DevSquad itself once the early version worked. Make of that what you will.
Two questions I'd genuinely love HN's take on: How are you handling Claude Code's token limits on large projects? And, given that right now this is built for vibe coders: does it matter if the person who built a tool can't write the code by hand, as long as the tool works?