I built this in ~3 days because every AI coding tool lets the AI decide whether to run tests. Forge flips it: deterministic code orchestrates the AI, not the other way around. Your existing lint/test/coverage scripts become hard gates. AI writes code, your scripts decide if it ships.
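To make "deterministic code orchestrates the AI" concrete, here is a minimal sketch of the control flow. It is not Forge's actual code: `run_gates`, `orchestrate`, the `generate` callback, and the retry limit are all hypothetical names and choices made for illustration. The point is that the gate commands always run, and only their exit codes decide success.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Callable, Sequence

# Hypothetical: a gate is any command that must exit 0 (lint, tests, coverage).
Gate = Sequence[str]


def run_gates(gates: Sequence[Gate]) -> tuple[bool, str]:
    """Run each gate in order and stop at the first non-zero exit code."""
    for cmd in gates:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            report = f"{' '.join(cmd)} failed:\n{result.stdout}{result.stderr}"
            return False, report
    return True, ""


def orchestrate(
    generate: Callable[[str], None],
    task: str,
    gates: Sequence[Gate],
    max_attempts: int = 3,
) -> bool:
    """Call the AI step, then always run the gates; retry with the failure report."""
    feedback = ""
    for _ in range(max_attempts):
        generate(task + feedback)      # AI step: assumed to write or edit files
        ok, report = run_gates(gates)  # deterministic step: the AI cannot skip it
        if ok:
            return True
        feedback = "\n\nPrevious attempt failed a gate:\n" + report
    return False


if __name__ == "__main__":
    # Stand-in gate and a no-op "AI" so the sketch runs without any model.
    demo_gates = [[sys.executable, "-c", "print('lint ok')"]]
    print(orchestrate(lambda prompt: None, "add a feature", demo_gates))
```

In this shape the model never gets a vote on whether tests run. It only sees the failure report on the next attempt.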
It's provider-agnostic: it ships with adapters for the Codex SDK, Claude Agent SDK, Claude Code, and Ollama. Forge built itself (the repo is the proof that the code produced is somewhat OK).
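For anyone wondering what "provider-agnostic" means in practice, a rough sketch of the adapter idea follows. Every name here (`Provider`, `EchoProvider`, `register`, `get_provider`) is an assumption for illustration and not Forge's real interface. The orchestrator depends only on a small interface, and each backend implements it.

```python
from typing import Dict, Protocol


class Provider(Protocol):
    """Hypothetical adapter interface: the only surface the orchestrator sees."""

    name: str

    def generate(self, prompt: str) -> str:
        ...


class EchoProvider:
    """Stand-in adapter; a real one would wrap a model SDK or a local server."""

    name = "echo"

    def generate(self, prompt: str) -> str:
        return f"# TODO: {prompt}"


# Hypothetical registry mapping a provider name to its adapter instance.
REGISTRY: Dict[str, Provider] = {}


def register(provider: Provider) -> None:
    REGISTRY[provider.name] = provider


def get_provider(name: str) -> Provider:
    """Look up an adapter by name, failing loudly on an unknown provider."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown provider: {name}") from None


register(EchoProvider())
```

Swapping backends is then a matter of registering another class with the same `generate` method. The gates stay the same either way.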
Happy to answer questions about the architecture or get roasted on the code quality.