Show HN: Zenflow – orchestrate coding agents without "you're right" loops

33•andrewsthoughts•1mo ago

Hi HN, I’m Andrew, Founder of Zencoder.

While building our IDE extensions and cloud agents, we ran into the same issue many of you likely face when using coding agents in complex repos: agents getting stuck in loops, apologizing, and wasting time.

We tried to manage this with scripts, but juggling terminal windows and copy-paste prompting was painful. So we built Zenflow, a free desktop tool to orchestrate AI coding workflows.

It handles the things we were missing in standard chat interfaces:

Cross-Model Verification: You can have Codex review Claude’s code, or run them in parallel to see which model handles the specific context better.

Parallel Execution: Run five different approaches on a backlog item simultaneously—mix "Human-in-the-Loop" for hard problems with "YOLO" runs for simple tasks.

Dynamic Workflows: Configured via simple .md files. Agents can actually "rewire" the next steps of the workflow dynamically based on the problem at hand.

Project list/kanban views across all workloads.
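As a rough illustration of the parallel and cross-model ideas in the list above (not Zenflow's actual implementation), here is a minimal Python sketch. The agent callables are hypothetical stand-ins for whatever invokes Claude Code, Codex, or Gemini; `run_in_parallel` and `cross_review` are names made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical: an "agent" is any callable that takes a prompt and returns text.
Agent = Callable[[str], str]


def run_in_parallel(agents: Dict[str, Agent], task: str) -> Dict[str, str]:
    """Run every agent on the same task concurrently; return outputs keyed by agent name."""
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {name: pool.submit(agent, task) for name, agent in agents.items()}
        return {name: future.result() for name, future in futures.items()}


def cross_review(author_output: str, reviewer: Agent) -> str:
    """Hand one agent's output to a different agent for review."""
    return reviewer("Review the following change:\n" + author_output)
```

In practice each callable would wrap a CLI or API call; the point is only that the approaches run side by side and a second model gets the final look.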

What we learned building this

To tune Zenflow, we ran 100+ experiments across public benchmarks (SWE-Bench-*, T-Bench) and private datasets. Two major takeaways that might interest this community:

Benchmark Saturation: Models are becoming progressively overtrained on all versions of SWE-Bench (even Pro). We found public results are diverging significantly from performance on private datasets. If you are building workflows, you can't rely on public benches.

The "Goldilocks" Workflow: In autonomous mode, heavy multi-step processes often multiply errors rather than fix them. Massive, complex prompt templates look good on paper but fail in practice. The most reliable setups landed in a narrow “Goldilocks” zone of just enough structure without over-orchestration.

The app is free to use and supports Claude Code, Codex, Gemini, and Zencoder.

We’ve been dogfooding this heavily, but I'd love to hear your thoughts on the default workflows and if they fit your mental model for agentic coding.

Download: https://zencoder.ai/zenflow

YT flyby: https://www.youtube.com/watch?v=67Ai-klT-B8

Comments

Pablosanzo•1mo ago

This is the next step after SDD: a system that enforces the orchestration to execute the spec. I appreciate that you can bring your own agent (Claude Code, Codex, Gemini, Zencoder)

Yoric•1mo ago

I suspect that we'll eventually loop back to formal specifications, with formal or semi-formal verification that the implementation matches the specification, but with agents writing the actual code.

thecoderpanda•1mo ago

The multi-model approach makes sense. We've noticed different models handle different things better, so being able to run them side by side is pretty useful. The dynamic workflow stuff is neat. Most tools are too rigid once you start. Will be interesting to see how it handles unexpected turns.

andrewsthoughts•1mo ago

Thanks, keep us posted. We are thinking of launching a gallery for workflows at some point, so people could pick the best one for specific situations.

gk39•1mo ago

Interesting! Just tested and it's quite impressive so far. I use multi-agent workflows on a daily basis. Overseeing them and limiting hallucinations has become a major pain for me. This is much needed.

andrewsthoughts•1mo ago

thank you!

celeryd•1mo ago

The language used on the website is very fresh! They are brave enough to call bad AI output "slop", which immediately makes me think they are trustworthy, that they are in the know. An AI bandwagoner wouldn't be brave enough to call it slop.

Then there's a blurb about the CEO who claims "AI doesn't need better prompts. It needs orchestration." which is something I have always felt to be true, especially after living through highly engineered prompts becoming suddenly useless when conditions change because of how brittle they are.

I might even give this a shot and I usually eschew AI plugins because of how cloud connected they are.

I am a nobody, but I think these people are making a bunch of right moves in this AI space.

andrewsthoughts•1mo ago

Thank you!

Doublon•1mo ago

FYI: advanced tracking protection in Firefox breaks your download form

andrewsthoughts•1mo ago

:-0 thanks for lmk, will get back to you on this asap

andrewsthoughts•1mo ago

Quick fix:

Apple Silicon (ARM64): https://download.zencoder.ai/zenflowapp/stable/0.0.52/app/da...

Intel (x64): https://download.zencoder.ai/zenflowapp/stable/0.0.52/app/da...

We'll figure out the FF script blocking.

cycahnh•1mo ago

ok, this tool really did some magic for my poor prompt "create a game like google chrome has when wifi is off". i also like that it kinda teaches me proper spec development and thinks for me about the tests and things i needed to specify in the first place.

nice

andrewsthoughts•1mo ago

ty!

kelsolaar•1mo ago

The app is really good, gave it a quick spin earlier, quite like it!

andrewsthoughts•1mo ago

   ,d88b.d88b,
   88888888888
   `Y8888888Y'
     `Y888Y'
       `Y'

leemalmac•1mo ago

Spent a bit of time with the app. I use Zencoder plugins for my personal projects, so I'm already familiar with their ecosystem.

First of all, kudos for the nice UI. I like when apps look good. The onboarding process was smooth. I paired it with Zencoder's agent (as mentioned, I use their VSCode plugin and already had a sub).

I used it to implement a small refactoring for my side project. What I like compared to the plugins: I did not have to switch between agents or explicitly ask it to write a plan/spec. I guess that's one of the core ideas behind the app, and it feels really AI-ish because it's not a code editor (similar to Claude Code). The only thing I missed in the process is rendered markdown for previews. But I did not use the app for long, so maybe there is an option to render markdown.

Overall great experience so far. Gonna explore it more. Wanna try it with Gemini and Claude Code. Again kudos it's not locked to use only Zencoder's agents.

andrewsthoughts•1mo ago

Thank you for the kind words! Point taken on the markdown rendering.

youkainoniichan•1mo ago

Tested it for a while. Great that I can finally run my SDD workflows easily without juggling a bunch of Claude Code commands in the terminal.

Also, I found an unexpected use case for it. Even when I only need to change a couple of lines of code, I just run the quick fix workflow for it, because Zenflow automatically creates the worktree, branch, commit, etc., and the PR is created with a few clicks. It may seem like a minor thing, but it irritates me a lot to do all this stuff myself for small changes. One thing I miss here is automatic PR name and description creation according to the templates my company uses.

andrewsthoughts•1mo ago

Agreed. We planned to add it for Zencoder subs. For the free product, if we included it, the API could be abused (unfortunately, we learnt that lesson the hard way), and doing it through your other CLI is just too slow for a nice UX (they take time to spin up). We'll play with it a bit more, maybe there's a happy path we missed.

redhale•1mo ago

This looks fantastic and I'm excited to try it today!

One question: I see this supports custom workflows, which I love and want to try out. Could this support a "Ralph Wiggum"-style [0][1] continuous loop workflow? This is a pattern I've been playing around with, and if I could implement it here with all the other features of this product, that would be pretty awesome.

[0] https://paddo.dev/blog/ralph-wiggum-autonomous-loops/

[1] https://github.com/onorbumbum/ralphio

andrewsthoughts•1mo ago

In short, yes:

Create a new task with your prompt, and hit "Create" (instead of "Create and Run"). The interface will show a little hint "Edit steps in plan.md", with 'plan.md' being clickable. Click on it and edit it, experimenting with some ideas. (Bonus tip: toggle "Auto-start steps" to keep it Ralph-y.)

I just winged the workflow below, and it worked for the prompt I threw at it. If you like it, you can save it as your custom workflow and use it in the future. If you don't like it, change it to your preference.

Now, I prefer a slightly different flow: Implement > Review > [Fix] (and typically limit the loop to 3 times to avoid "divergence"). We'll ship some pre-built templates for that soon. Our researchers are currently working on various variations on our private datasets.
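For anyone who wants to reason about the Implement > Review > [Fix] shape outside the app, a bounded loop can be sketched in a few lines of Python. This is only an illustration under assumptions, not how Zenflow implements it: `implement`, `review`, and `fix` are hypothetical callables standing in for agent invocations, and `Review` is a made-up result type.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Review:
    approved: bool
    comments: List[str]


def implement_review_fix(
    implement: Callable[[str], str],
    review: Callable[[str], Review],
    fix: Callable[[str, List[str]], str],
    task: str,
    max_fix_rounds: int = 3,
) -> Tuple[str, bool]:
    """Run Implement > Review > [Fix], capping the number of fix rounds.

    Returns the last result and whether the reviewer approved it.
    """
    result = implement(task)
    for _ in range(max_fix_rounds):
        verdict = review(result)
        if verdict.approved:
            return result, True
        # Feed the reviewer's comments back into the next fix attempt.
        result = fix(result, verdict.comments)
    # One last review after the final allowed fix; no further fixes are attempted.
    return result, review(result).approved
```

The cap is what keeps the loop from "diverging": after `max_fix_rounds` attempts the result goes back to a human with the reviewer's final verdict attached.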

---

# Quick change

## Configuration

- *Artifacts Path*: {@artifacts_path} → `.zenflow/tasks/{task_id}`

---

## Agent Instructions

This is a quick change workflow for small or straightforward tasks where all requirements are clear from the task description.

### Your Approach

1. Proceed directly with implementation
2. Make reasonable assumptions when details are unclear
3. Do not ask clarifying questions unless absolutely blocked
4. Focus on getting the task done efficiently

This workflow also works for experiments when the feature is bigger but you don't care about implementation details.

If blocked or uncertain on a critical decision, ask the user for direction.

---

## Workflow Steps

### [ ] Step: Implementation

Implement the task directly based on the task description.

1. Make reasonable assumptions for any unclear details
2. Implement the required changes in the codebase
3. Add and run relevant tests and linters if applicable
4. Perform basic manual verification if applicable

Save a brief summary of what was done to `{@artifacts_path}/report.md` if significant changes were made.

After you are done with the step add another one to `{@artifacts_path}/plan.md` that will describe the next improvement opportunity.
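Since the workflow above is plain markdown with `### [ ] Step: ...` headings, it is easy to inspect with a script. The sketch below is a guess at how one might read such a file, not Zenflow's own parser: it assumes a finished step is marked `[x]`, which the thread does not confirm, and the function names are invented for this example.

```python
import re
from typing import List, Optional, Tuple

# Matches headings like "### [ ] Step: Implementation" or "### [x] Step: Review".
# The "[x]" done-marker is an assumption, not something confirmed in the thread.
STEP_RE = re.compile(r"^### \[( |x)\] Step: (.+)$", re.MULTILINE)


def list_steps(plan_text: str) -> List[Tuple[str, bool]]:
    """Return (step name, done?) pairs in file order."""
    return [(m.group(2).strip(), m.group(1) == "x") for m in STEP_RE.finditer(plan_text)]


def next_pending_step(plan_text: str) -> Optional[str]:
    """Return the first step that is not checked off, or None if all are done."""
    for name, done in list_steps(plan_text):
        if not done:
            return name
    return None


def mark_done(plan_text: str, name: str) -> str:
    """Check off the first pending step with the given name."""
    pattern = re.compile(
        r"^### \[ \] Step: " + re.escape(name) + r"[ \t]*$", re.MULTILINE
    )
    return pattern.sub(lambda m: "### [x] Step: " + name, plan_text, count=1)
```

With a loop like "run `next_pending_step`, hand it to an agent, `mark_done`, repeat", the last instruction in the workflow (append a new step when done) is what keeps the loop going.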

redhale•1mo ago

Awesome, thanks! I will try this out today.

xtiansimon•1mo ago

> “…agents getting stuck in loops, apologizing, and wasting time. We tried to manage this … So we built Zenflow…Cross-Model Verification, Parallel Execution, Dynamic Workflows”

So you got me with the hook, and you bullet three features, but where’s the resolution of the hook issue? You left me with the hook?? What am I missing?

andrewsthoughts•1mo ago

I think the easiest is just to try (it's a free download that would work with your CC/Codex/Gemini CLI). Pick a mid-size task that you would typically throw at an agent (not something trivial, but also not something that will bring your mid-size engineer to his knees), and run it in Zenflow's "Spec and Build" mode.

For us, in this scenario: 1) the pipeline helps the agent perform better; 2) reviewing the spec is much more convenient than juggling between a TUI and a text editor (esp. if you are running 5 of those pipelines in parallel); 3) if you configure the reviewer in the settings, cross-agent review saves us from some of the minutiae of guiding/aligning the agent.

Lmk if I misunderstood your question, happy to help.

edmundsauto•1mo ago

It looks like the main difference in the plans is the premium LLM calls, but what does that add if I just bring my own key?

I am just a hobbyist but was curious how you’re thinking through the pricing plans.

andrewsthoughts•1mo ago

In our IDE plugins, we support BYOK: https://docs.zencoder.ai/features/models I'll check with the team how to do it in Zenflow.

Meanwhile, you can BYOA (bring your own agent). If you are a hobbyist, Gemini is free with a Gmail account (but they WILL train on your data). And if you have a ChatGPT sub, you can use the Codex CLI with Zenflow at no extra charge (and they don't train on paid users' data).

redhale•1mo ago

Is it possible to see the prompt files for the built-in workflows? I find them to be quite good, but not exactly what I want. My preference would be to simply tweak them slightly rather than starting from scratch with a custom workflow.

andrewsthoughts•1mo ago

Yep. Create a new task, and hit "Create" (instead of "Create and Run"). The interface will show a little hint "Edit steps in plan.md", with 'plan.md' being clickable. Click on it, and it will open up in full glory for you.

redhale•1mo ago

Ah, perfect. Thank you!

sfc32•1mo ago

The multi-LLM approach is a great direction and I like the polished feel of the application. Moving up to a more project-management approach is welcome. I will download it and give it a go.

One thought - vendors like cursor.ai have the benefit of highly tuned prompts, presumably by programming language, as the result of their user bases. How is it possible to compete with this?

On another note, I have played around with v0 etc, but AFAIK there is no really good UX/UI AI tool that can effectively replace a designer in the way that coding tools are replacing engineers (to a certain extent).

andrewsthoughts•1mo ago

Thanks for giving it a try!

On prompts: We've been competing with Cursor for the last 2 years in the enterprise with Zencoder, and winning nice deals based on quality. At some point, we were very protective of our prompts, but two things happened:

- Most of the coding vendors' prompts were leaked; there are repos online that have prompts from a bunch of them. The moment you allow a custom endpoint for the LLM, your prompts are sniffable.

- Agents became better at instruction following, so a lot of prompting changed to "less is more".

So with these two industry trends, we reversed course:

- Moved our harness into a CLI. This exposes our tips and tricks, but is better for user privacy and for the user's ability to tinker with the harness. For example, this allows a setup where no code leaves your perimeter (if you use a local harness and a "local" model, where "local" means different things to different people).

- Opened the workflows in Zenflow (they are in markdown and editable).

intuxikated•1mo ago

no linux version :( also, how does this compare to something like vibe-kanban?

deaux•1mo ago

Never seen a post this blatantly and intensely astroturfed/seeded on HN. 100% positive comment rate despite the type of product it is, nearly all of them from either same-day-created green accounts or 5+ year old, very low activity accounts coming out of the woodwork. You need to study the community more before doing this, to understand ratios and how people talk and interact.
