The problem it exists to solve: LLMs degrade as context fills up. This is measured, not anecdotal (Chroma Research). Every AI coding tool today runs a single agent start to finish, so by phase 4 of your plan, half the context window is spent remembering what was already done. I was restarting sessions by hand, copy-pasting progress notes just to keep quality up. That's not autonomous development. That's babysitting.

The architecture has four components:

- The Master Plan: your PRD, read fresh by every agent. It never accumulates in context.
- The Baton: a hard-capped 40-line handoff note. The cap is intentional; a bloated handoff recreates context rot in the next agent.
- The Signals: trigger phrases agents emit so the orchestrator can dispatch without understanding the code.
- The Context Budget: real-time token tracking, with automatic handoff at a threshold.
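The two mechanical pieces, the baton cap and the budget threshold, can be sketched in a few lines. This is an illustrative sketch, not Tarvos's actual API: the function names, the 80% threshold, and the token numbers are all assumptions; only the 40-line cap comes from the architecture.

```python
BATON_MAX_LINES = 40      # the hard cap from the architecture
HANDOFF_THRESHOLD = 0.8   # assumed: hand off once 80% of the window is spent

def validate_baton(baton: str) -> str:
    """Reject a baton that would recreate context rot in the next agent."""
    lines = baton.strip().splitlines()
    if len(lines) > BATON_MAX_LINES:
        raise ValueError(f"baton is {len(lines)} lines; cap is {BATON_MAX_LINES}")
    return baton

def should_hand_off(tokens_used: int, context_window: int) -> bool:
    """True once the session has burned through enough of its window."""
    return tokens_used / context_window >= HANDOFF_THRESHOLD

# A fresh agent then reads the Master Plan from disk plus the baton only,
# never the previous agent's transcript.
```

The point of the cap is that it fails loudly: an agent that tries to hand off a transcript instead of a summary gets rejected, rather than silently poisoning the next session.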
Tarvos is the reference implementation for Claude Code. Each session runs in its own git worktree. Accepting a session merges its work into your branch; rejecting discards it cleanly.
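The per-session worktree lifecycle maps onto plain git commands. A minimal sketch, assuming one branch and one worktree directory per session; the `session/` and `.worktrees/` naming is illustrative, not what Tarvos actually uses:

```python
# Illustrative sketch: build the git commands for one session's lifecycle.
# Accept = merge into your branch; reject = delete, leaving main untouched.

def start_session_cmds(session_id: str, base_branch: str = "main") -> list[str]:
    """Create an isolated worktree and branch for one agent session."""
    return [
        f"git worktree add -b session/{session_id} .worktrees/{session_id} {base_branch}",
    ]

def accept_cmds(session_id: str, base_branch: str = "main") -> list[str]:
    """Merge the session's work into your branch, then clean up."""
    return [
        f"git checkout {base_branch}",
        f"git merge --no-ff session/{session_id}",
        f"git worktree remove .worktrees/{session_id}",
        f"git branch -d session/{session_id}",
    ]

def reject_cmds(session_id: str) -> list[str]:
    """Discard the worktree and branch; the base branch is never touched."""
    return [
        f"git worktree remove --force .worktrees/{session_id}",
        f"git branch -D session/{session_id}",
    ]
```

Because every session works in its own checkout, a rejected session leaves no trace, and two sessions can never trample each other's working tree.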
Open source, MIT. Works today, rough edges exist.