Eventually we won’t be looking at Python at all, the same way we don’t look at Assembly today. I never check the binary output of GCC because I trust it. The workflow I’m seeing and using is completely different: I want to see teams code reviewing the plan, not just the implementation.
AI is not deterministic, so we're not quite at the GCC level yet. Still, a good plan review is worth 10x more than an implementation review. Code is a commodity; the plan is the unsolved part. You can spend hours letting your agent implement and then throw it all away, or get buy-in from your team and (almost) one-shot most tasks. This was always true even before AI: aligning on what to build always mattered more than the how. But tools like Claude Code and Cursor make it the only part that really matters.
The team should align on a structured text file. Call it plan.md or whatever fits the tool you’re implementing it with. It describes the feature, the logic, and most importantly the success criteria.
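A minimal sketch of what such a file can look like; the feature, headings, and metric names below are hypothetical placeholders, not a required format:

```markdown
# Plan: rate-limit the public search endpoint   <!-- hypothetical example feature -->

## Context
Search traffic spikes are degrading p99 latency for logged-in users.

## Approach
- Add a token-bucket limiter in the API gateway, keyed by client IP.
- Return 429 with a Retry-After header when the bucket is empty.
- Put the limiter behind a feature flag so it can be disabled without a deploy.

## Out of scope
- Per-account quotas and billing tiers.

## Success criteria
- Load test: a 1,000 rps burst returns 429s instead of pushing p99 above 300 ms.
- Existing integration tests still pass.
- A `rate_limited_requests_total` counter shows up in the metrics dashboard.
```

The success criteria section is the part worth arguing over, since it is what the agent will later verify itself against.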
Here’s the actual workflow:
1. Pick up a task and create a plan.md file using Claude Code / Cursor. Iterate on this for as long as you need to. Make sure you have good success criteria the agent can build towards.
2. Open a Draft PR with that text file and drop it in Slack (see the sketch after this list). The team aligns on the approach in Slack or GitHub comments; I usually prefer Slack for iterating on a plan and GitHub comments for code comments.
3. Once the team gives the plan a thumbs-up, point the agent at it. Since the success criteria are written out, the agent can self-verify.
4. Once you’re happy with the implementation, update the PR with the generated code and have your teammates review it as they would any other PR, except now they have much more context since they’ve already reviewed your plan.
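For step 2, the plan-only Draft PR needs nothing beyond git and the GitHub CLI; the branch name and titles below are placeholders:

```bash
# Branch off main and commit only the plan file (names are placeholders)
git switch -c plan/rate-limit-search
git add plan.md
git commit -m "Plan: rate-limit the public search endpoint"
git push -u origin plan/rate-limit-search

# Open a Draft PR whose description is the plan itself,
# so the team can comment on the approach before any code exists
gh pr create --draft --title "Plan: rate-limit search" --body-file plan.md
```

When the implementation lands later (step 4), the agent's commits go onto the same branch and the Draft PR is simply marked ready for review.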