I’ve been using CLI-based agents in real-world full-stack projects, and I kept hitting the same wall: long-prompt fragility. As tasks get more complex, agents start ignoring system rules, looping on trivial errors, or losing context mid-workflow.
Most people treat these as "model issues," but I started seeing them as orchestration issues. Instead of cramming every instruction into one massive prompt and hoping the model keeps it all in its head, I built oh-my-ag. It’s an orchestration layer for Antigravity that enforces structured collaboration between specialized agents.
The Core Idea: Process over Prompting
Rather than a single "god-agent," oh-my-ag breaks down the workflow into explicit roles:
PM: Handles requirement decomposition and tasking.
Dev (FE/BE/Mobile): Implementation within a strictly scoped domain.
QA & Debug: QA validates the work against the requirements, while Debug analyzes failures.
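To make "strictly scoped domain" concrete, here is a minimal sketch of how a dispatcher could check PM-produced tasks against per-role file scopes before handing them to a dev agent. The role names, path prefixes, and the `Task` shape are all assumptions made up for illustration; this is not oh-my-ag's actual configuration or code.

```typescript
// Hypothetical role names and path prefixes; oh-my-ag's real setup may differ.
type DevRole = "frontend" | "backend" | "mobile";

interface Task {
  id: string;
  role: DevRole;
  files: string[]; // files the task intends to modify
}

// Each dev role may only touch files under its own prefixes (assumed layout).
const SCOPES: Record<DevRole, string[]> = {
  frontend: ["apps/web/"],
  backend: ["apps/api/"],
  mobile: ["apps/mobile/"],
};

// Returns the files a task would touch outside its role's scope.
function outOfScopeFiles(task: Task): string[] {
  const allowed = SCOPES[task.role];
  return task.files.filter(
    (file) => !allowed.some((prefix) => file.startsWith(prefix)),
  );
}

// Splits tasks into those safe to dispatch and those to send back to the PM.
function partitionTasks(tasks: Task[]): { accepted: Task[]; rejected: Task[] } {
  const accepted: Task[] = [];
  const rejected: Task[] = [];
  for (const task of tasks) {
    if (outOfScopeFiles(task).length === 0) {
      accepted.push(task);
    } else {
      rejected.push(task);
    }
  }
  return { accepted, rejected };
}

const plan: Task[] = [
  { id: "T1", role: "frontend", files: ["apps/web/src/Login.tsx"] },
  { id: "T2", role: "backend", files: ["apps/api/src/auth.ts", "apps/web/src/api.ts"] },
];

const { accepted, rejected } = partitionTasks(plan);
console.log(accepted.map((t) => t.id), rejected.map((t) => t.id));
```

In this sketch T2 is rejected because a backend task tries to edit a file under `apps/web/`. The point is that the scope check lives in the orchestrator, so it holds whether or not the model remembers the rule.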
Key Technical Features:
Shared Memory (Serena): A persistence layer that keeps decisions and intermediate states consistent, even if you switch models mid-session.
Reduced Volatility: By splitting responsibilities, a minor model hallucination in implementation doesn't derail the entire PM-level task logic.
Parallel Execution: The orchestrator can trigger sub-agents simultaneously where appropriate.
Tooling: Built-in support for Gemini/Claude/Codex CLIs and MCP-scoped tool access.
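As a rough illustration of the shared-memory and reduced-volatility points above, the sketch below records every task outcome in a store that outlives any single agent or model, and isolates failures so that one bad implementation step does not stop the rest of the plan. Everything here is hypothetical: an in-memory `Map` stands in for the persistence layer, and names such as `SharedMemory`, `runPlan`, `model-a`, and `model-b` are invented for the example and are not Serena's or oh-my-ag's API.

```typescript
// Illustrative only: an in-memory Map stands in for the persistence layer.
// None of these names come from Serena or oh-my-ag.
type TaskStatus = "done" | "failed";

interface MemoryEntry {
  agent: string; // which agent wrote the entry
  model: string; // which model backed the agent at the time
  status: TaskStatus;
  note: string;
}

class SharedMemory {
  private entries = new Map<string, MemoryEntry>();

  record(taskId: string, entry: MemoryEntry): void {
    this.entries.set(taskId, entry);
  }

  get(taskId: string): MemoryEntry | undefined {
    return this.entries.get(taskId);
  }

  // Task ids that a debug agent should pick up.
  failedTasks(): string[] {
    const failed: string[] = [];
    this.entries.forEach((entry, taskId) => {
      if (entry.status === "failed") failed.push(taskId);
    });
    return failed;
  }
}

type Worker = (taskId: string) => string; // returns a short result note

// Runs each task in isolation: one worker throwing does not stop the plan.
function runPlan(
  plan: string[],
  worker: Worker,
  agent: string,
  model: string,
  memory: SharedMemory,
): void {
  for (const taskId of plan) {
    try {
      const note = worker(taskId);
      memory.record(taskId, { agent, model, status: "done", note });
    } catch (error) {
      const note = error instanceof Error ? error.message : String(error);
      memory.record(taskId, { agent, model, status: "failed", note });
    }
  }
}

const memory = new SharedMemory();

// Simulates a hallucinated import in one implementation task.
const flakyWorker: Worker = (taskId) => {
  if (taskId === "T2") throw new Error("referenced a module that does not exist");
  return `implemented ${taskId}`;
};

runPlan(["T1", "T2", "T3"], flakyWorker, "backend", "model-a", memory);

// A later pass on a different model reads the same state and retries only the failure.
runPlan(memory.failedTasks(), (taskId) => `fixed ${taskId}`, "debug", "model-b", memory);
console.log(memory.get("T2"), memory.failedTasks());
```

Because the plan and its outcomes live in the store rather than in any one prompt, the second pass only needs the list of failed task ids, not the full conversation history of the first.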
You can try it with a single command: bunx oh-my-ag
It’s currently being used in production-level iterations for full-stack web and mobile projects (specifically those built on fullstack-starter). I'd love to hear your thoughts on how you're handling agentic "drift" in your own workflows.