1. Long sessions cause context drift — the AI gradually ignores the original design
2. The AI writes fake tests — empty assertions, mocking the thing being tested
3. No research phase — the AI guesses how a framework works instead of reading the docs
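To make the second failure mode concrete, here is a hypothetical sketch of what "fake tests" tend to look like in practice (the function and test names are illustrative, not from the project):

```python
def parse_config(text: str) -> dict:
    """Illustrative function under test (hypothetical, not from the project)."""
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

# Failure mode A: the empty assertion. This "test" passes even if
# parse_config returns garbage, because nothing about the result is checked.
def test_parse_config_empty_assertion():
    parse_config("key=value")
    assert True

# Failure mode B: mocking the thing being tested. The stub replaces the
# real function, so the assertion only ever verifies the stub.
def test_parse_config_mocked():
    fake_parse = lambda text: {"key": "value"}
    assert fake_parse("key=value") == {"key": "value"}

# A real test exercises the actual implementation and can actually fail.
def test_parse_config_real():
    assert parse_config("key=value") == {"key": "value"}
```

All three tests pass, but only the last one would fail if `parse_config` broke — which is exactly what mutation testing (below) is designed to detect.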
OPC Workflow is my fix: three markdown files you put in your project and trigger as slash commands (`/plan_sprint`, `/sprint`, `/audit`).
The core mechanic is isolated sessions:

- Planning happens in session A, then you close it
- Development happens in session B, then you close it
- Auditing happens in session C with zero knowledge of session B
The audit is the part I'm most proud of. It runs mutation testing — deliberately breaking each core function to verify the tests actually catch it. In my project, it found a module that directly instantiated components, bypassing the agent registry entirely. Security boundaries, tool injection, and the memory system were all silently failing. I had written both the code AND the tests. Confirmation bias is a real problem.
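The idea behind mutation testing can be sketched in a few lines. This is a minimal, hypothetical illustration of the principle (not the workflow's actual implementation): swap in a deliberately broken variant of a function and check that the test suite fails.

```python
# Minimal sketch of mutation testing. All names here are hypothetical.

def add(a, b):
    return a + b

def test_add():
    assert add(2, 3) == 5

def survives_mutation(mutant) -> bool:
    """Run the suite against a mutant; return True if the mutant slips through."""
    global add
    original = add
    add = mutant
    try:
        test_add()
        return True   # test still passed: the mutation was NOT caught (bad)
    except AssertionError:
        return False  # test failed: the mutation was caught (good)
    finally:
        add = original

# Mutation: replace + with -. A healthy test suite should catch this.
caught = not survives_mutation(lambda a, b: a - b)
print("mutation caught:", caught)  # → mutation caught: True
```

A test suite full of empty assertions would let every mutant survive; a 100% capture rate means every deliberate break was detected.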
Real numbers: 7 sprints, 459 tests, 100% mutation capture rate, 1 critical bug found.
It works with Claude Code, Cursor, Kiro, and Antigravity. One-line install for macOS/Linux:

```bash
bash <(curl -sSL https://raw.githubusercontent.com/yoyayoyayoya/opc-workflow/main/install.sh)
```
And for Windows PowerShell:

```powershell
iex (iwr -useb 'https://raw.githubusercontent.com/yoyayoyayoya/opc-workflow/main/install.ps1').Content
```
Open to feedback, especially from people who've found other failure modes with AI coding tools.

https://github.com/yoyayoyayoya/opc-workflow