While building our IDE extensions and cloud agents, we ran into the same issue many of you likely face when using coding agents in complex repos: agents getting stuck in loops, apologizing, and wasting time.
We tried to manage this with scripts, but juggling terminal windows and copy-paste prompting was painful. So we built Zenflow, a free desktop tool to orchestrate AI coding workflows.
It handles the things we were missing in standard chat interfaces:
Cross-Model Verification: You can have Codex review Claude’s code, or run them in parallel to see which model handles the specific context better.
Parallel Execution: Run five different approaches on a backlog item simultaneously—mix "Human-in-the-Loop" for hard problems with "YOLO" runs for simple tasks.
Dynamic Workflows: Configured via simple .md files. Agents can actually "rewire" the next steps of the workflow dynamically based on the problem at hand.
Project list/kanban views across all workload
What we learned building this
To tune Zenflow, we ran 100+ experiments across public benchmarks (SWE-Bench-*, T-Bench) and private datasets. Two major takeaways that might interest this community:
Benchmark Saturation: Models are becoming progressively overtrained on all versions of SWE-Bench (even Pro). We found public results are diverging significantly from performance on private datasets. If you are building workflows, you can't rely on public benches.
The "Goldilocks" Workflow: In autonomous mode, heavy multi-step processes often multiply errors rather than fix them. Massive, complex prompt templates look good on paper but fail in practice. The most reliable setups landed in a narrow “Goldilocks” zone of just enough structure without over-orchestration.
The app is free to use and supports Claude Code, Codex, Gemini, and Zencoder.
We’ve been dogfooding this heavily, but I'd love to hear your thoughts on the default workflows and if they fit your mental model for agentic coding.
Download: https://zencoder.ai/zenflow YT flyby: https://www.youtube.com/watch?v=67Ai-klT-B8
Pablosanzo•2h ago
Yoric•2h ago