Is this a way to increase token burn?
I thought we covered this with Claude's C compiler. What changed?
I feel like there are more efficient ways to tackle the issues given.
https://blog.cloudflare.com/dynamic-workflows/
Also, isn’t all of this already easy to do on any of the platforms (including Claude before this, and OpenAI too)?
It's telling that they used "rewrite Bun in Rust" as the proof point here. It's cool! But the vast majority of software engineering doesn't start with tens of thousands of tests, where making them pass is the whole job.
In my experience, AI still drifts from what I meant it to do on anything bigger than building a widget. My time is spent suspiciously reviewing output for changes the agent snuck in, or invariants it broke. I talked with a friend recently where the agent broke the test harness badly enough that none of the tests mattered for 3 weeks. They did pass, though, so CI never complained.
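One cheap guard against that "green but meaningless" failure mode is to compare each run's test counts against a known baseline, so a harness that silently stops collecting tests fails loudly. A minimal sketch, assuming the runner can report counts; `RunSummary`, `check_run_is_meaningful`, and the 5% threshold are hypothetical names and values, not part of any real CI system:

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Counts reported by a test runner (hypothetical shape, not a real CI API)."""
    collected: int
    passed: int
    failed: int
    skipped: int


def check_run_is_meaningful(summary: RunSummary, baseline_collected: int,
                            max_drop_ratio: float = 0.05) -> list[str]:
    """Return a list of problems; an empty list means the green run looks trustworthy."""
    problems = []
    if summary.failed > 0:
        problems.append(f"{summary.failed} tests failed")
    # A broken harness often still exits 0, it just collects far fewer tests.
    min_expected = baseline_collected * (1 - max_drop_ratio)
    if summary.collected < min_expected:
        problems.append(
            f"only {summary.collected} tests collected, baseline is {baseline_collected}"
        )
    # Tests that are collected but all skipped are another way to be vacuously green.
    if summary.collected > 0 and summary.passed == 0:
        problems.append("no test actually passed")
    return problems
```

Treating a non-empty result as a CI failure would have flagged the harness breakage on the first run instead of after three weeks, at least under these assumptions.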
There's something at the intersection of context engineering, managing that sloppy pile of markdown plans, and good old-fashioned system understanding that's the real bottleneck.
Is there an example of how y'all use Dynamic Workflows internally that you could share with the rest of us here so that we can mimic something similar?
1. Autonomously landed 20+ optimizations to reduce Claude Code's token usage by ~15%
2. Ported tree-sitter, color-diff, yoga-layout, and a number of other WASM and Rust native modules to TypeScript, improving CPU and memory use by 2-10x in the process
3. Made our CI faster, and repeatedly found and fixed flaky tests (with /loop)
4. Migrated from regex-based bash static analysis to tree-sitter, reducing false positive permission prompts by 45%
5. Reduced Claude Agent SDK startup time by 61%, by repeatedly profiling and optimizing the startup path, putting up a number of PRs in the process
6. Shipped 69 code simplification PRs, deleting >10k lines of code
I need more mechanisms for controlling long-running sessions and dynamically injecting my thoughts, corrections, and nudges, rather than faster ways to burn through my tokens without knowing if the results are going to be correct.
"Agents address the problem from independent angles, other agents try to refute what they found, and the run keeps iterating until the answers converge."
So you will be supplying the "ground truth" (test suite, detailed spec, whatever) and empowering an agent to use it to guide the other agents. Currently, a lot of people do this sequentially, as multiple code-review passes by fresh agent sessions looking at the work of previous sessions.
Adversarial models are a longstanding technique in ML so it makes sense they would try to go this way.
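To make the propose/refute/converge idea concrete, here is a minimal sketch of such a loop with agents modeled as plain functions. Everything here is an assumption for illustration: `converge`, the `Proposer`/`Refuter` types, and the feedback format are hypothetical and say nothing about how dynamic workflows are actually implemented.

```python
from typing import Callable, Optional

Proposer = Callable[[str], str]                 # prompt -> candidate answer
Refuter = Callable[[str, str], Optional[str]]   # (task, candidate) -> objection or None


def converge(task: str,
             proposers: list[Proposer],
             refuters: list[Refuter],
             ground_truth: Callable[[str], bool],
             max_rounds: int = 5) -> tuple[Optional[str], int]:
    """Iterate until a candidate survives every refuter and the ground-truth check.

    Returns (accepted candidate or None, number of rounds used).
    """
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        for propose in proposers:
            # Each proposer sees the task plus all objections raised so far.
            candidate = propose(task + feedback)
            objections = [o for r in refuters if (o := r(task, candidate)) is not None]
            # The human-supplied ground truth (tests, spec) has the final say.
            if not ground_truth(candidate):
                objections.append("fails ground-truth check")
            if not objections:
                return candidate, round_no
            feedback += f"\nRejected {candidate!r}: {'; '.join(objections)}"
    return None, max_rounds
```

The point of the sketch is that the refuters only add pressure; without a `ground_truth` check that you trust, the loop can converge on something that is merely agreed upon.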
1. Support for 1-2 OOMs more agents, to do more work in parallel
2. A phased, semi-structured approach where work happens in steps
I did find it uses tokens like crazy. I migrated Pixel Dungeon (Java) to C# as an experiment, and it used almost 2 billion tokens. It was just 20 bucks thanks to DeepSeek flash, but I shudder to think how much this costs when run at the real Claude API pricing.
I did port stb_image from C to Jai, which I was able to fully verify and harden, and that one I'll get more use out of. I'm also using the same workflow system for agentic translation of a game I work on from English into various other languages; the results are far better than the commercial "human" translation services we tested. I also use it to fix OCR issues in PDF books I'm OCR-ing for a data pipeline. This kind of workflow/wide agent swarm system is rather useful for many things where you want to "apply" the same prompts across a whole codebase or just in parallel.
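The "apply the same prompt across everything" pattern is simple to sketch. The version below assumes the model call is wrapped in a single function; `fan_out` and `run_agent` are hypothetical names, and `run_agent` stands in for whatever client you actually use:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out(prompt_template: str,
            items: dict[str, str],
            run_agent: Callable[[str], str],
            max_workers: int = 8) -> dict[str, str]:
    """Apply one prompt template to every item in parallel, keyed by item name.

    The template may use the placeholders {name} and {text}.
    """
    def work(entry: tuple[str, str]) -> tuple[str, str]:
        name, text = entry
        return name, run_agent(prompt_template.format(name=name, text=text))

    # Threads are enough here because the real work would be network-bound calls.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(work, items.items()))
```

For example, `fan_out("Fix OCR errors in {name}:\n{text}", pages, run_agent)` would return one corrected string per page, assuming `pages` maps page names to raw text. Token cost scales linearly with the number of items, which matches the bill described above.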
I’m at the point where deciding what we should and should not do takes a lot more time than actually doing it. More agents just means running faster in potentially the wrong direction.
It's like you guys aren't even aware of the primary problem you are all facing: your token burns aren't paying off anymore against standard coding, and are looking net negative. I have to ask, are you this unaware of your core problem set here?
There aren't any examples, proofs, or scenarios that show an improvement in the complexity or reliability of the solution, or in the efficiency of the path to it. I'm baffled.
Maybe blasphemy, but will workflows be able to use non-Anthropic LLMs (e.g., delegating some steps to local models, but design and review by Claude)?
Something most models do, Claude Code included, is use three.js, which comes with many limitations compared to what the rendering engines in native game engines can do and the accompanying plugins/toolsets they offer. However, the fast iteration from ideation to concept to prototype is invaluable.
My team is building a way to vibe code full-featured Unreal Engine games directly in the browser, with a publishing workflow straight to the browser. The games are then rendered in WebGPU and use WebAssembly for near-native performance. We think this pipeline and workflow will be transformational for the gaming industry.
Would love to show off what we have. You can DM me on X:
and got
API Error: 400 messages.3.content.11: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.
Tried again in
Claude 1.9659.1 (193bcb) 2026-05-28T16:22:15.000Z also but may need a new chat
But each prompt will cost your company 10 to 15 million dollars. An extra 20 million if you ask them to review the code and improve the comments.
It feels more like a bespoke build system for the specific task/project than prompting a freeform chat.
mil22•48m ago
Rewriting Bun with dynamic workflows
An example of what dynamic workflows can unlock at scale is the recent rewrite of Bun. Jarred Sumner used dynamic workflows to port Bun from Zig to Rust with 99.8% of the existing test suite passing, roughly 750,000 lines of Rust, and eleven days from first commit to merge. One workflow mapped the right Rust lifetime for every struct field in the Zig codebase. The next wrote every .rs file as a behavior-identical port of its .zig counterpart, hundreds of agents working in parallel with two reviewers on each file. A fix loop then drove the build and test suite until both ran clean. After the port landed, an overnight workflow addressed unnecessary data copies and opened a PR for each for final review. While not yet in production, all of this was handled by dynamic workflows. Jarred will be writing about this more in the future.
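The "fix loop" phase in that description can be pictured as a bounded check-and-repair cycle. This is only a sketch of the general shape, not the workflow Jarred used; `fix_loop`, `run_checks`, and `apply_fix` are hypothetical names, with `apply_fix` standing in for an agent call:

```python
from typing import Callable


def fix_loop(run_checks: Callable[[], list[str]],
             apply_fix: Callable[[str], None],
             max_iterations: int = 10) -> tuple[bool, list[int]]:
    """Run checks, fix the first failure, and repeat until clean or out of budget.

    Returns (clean, history) where history holds the failure count of each check run.
    """
    history: list[int] = []
    for _ in range(max_iterations):
        failures = run_checks()      # e.g. build errors plus failing test names
        history.append(len(failures))
        if not failures:
            return True, history
        apply_fix(failures[0])       # in a real workflow this would be an agent call
    # Budget exhausted: report failure rather than looping (and spending) forever.
    return False, history
```

The iteration cap and the returned history matter more than the loop itself: they are what lets a human see whether the failure count is actually falling or the run is just burning tokens.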
SkyPuncher•34m ago
Mechanical refactors are relatively straightforward for agents.