Sakana AI has presented its work “Learning to Orchestrate Agents in Natural Language with the Conductor,” accepted to ICLR 2026. The idea is simple but powerful: instead of forcing a single model to handle an entire task on its own, the researchers trained a separate 7B model to act as a manager for other AIs.
This Conductor doesn’t write code or solve tasks directly. It looks at a problem and decides which agents to deploy, what subtask to give each one, and what context to provide. Essentially, it’s not just a router between models — it’s a meta-prompt engineer that assembles a working AI team tailored to a specific task.
What’s most interesting is that this behavior emerged not from hardcoded rules, but through reinforcement learning. For simple questions, the Conductor might rely on a single model call. For complex tasks, it builds a chain on its own: a planner, an executor, a verifier, and a correcting agent. It closely resembles how a strong team breaks down complex work into distinct roles.
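The Conductor's actual decisions are expressed in natural language, but the shape of each decision (which agent, what subtask, what context) can be sketched as a tiny data structure. This is a toy illustration with hypothetical names, not the paper's interface; the hardcoded rule merely mimics the learned behavior of using one call for simple tasks and a role-based chain for complex ones:

```python
from dataclasses import dataclass

@dataclass
class AgentCall:
    """One delegation decision: which agent, what subtask, what context."""
    agent: str    # name of a model in the pool
    subtask: str  # natural-language instruction for that agent
    context: str  # the slice of prior work this agent gets to see

def conduct(task: str, pool: list[str]) -> list[AgentCall]:
    """Toy stand-in for the Conductor's output.

    The real Conductor is a trained 7B model that learned this behavior
    via reinforcement learning; the rule below only shows the interface.
    """
    if len(task.split()) < 10:  # crude "is this task simple?" proxy
        return [AgentCall(pool[0], task, context="")]
    return [
        AgentCall(pool[0], f"Write a step-by-step plan for: {task}", ""),
        AgentCall(pool[1], "Execute the plan.", "<planner output>"),
        AgentCall(pool[2], "Check the result for errors.", "<executor output>"),
        AgentCall(pool[3], "Fix any issues found.", "<verifier output>"),
    ]
```

A short question would thus produce a single call, while a long task description yields the planner/executor/verifier/corrector chain described above.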
The results look impressive. The 7B Conductor outperformed every individual model in its pool, including GPT-5, Gemini, Claude, and the open-source models available at the time of the research. The paper reports new state-of-the-art results on LiveCodeBench (83.9%) and GPQA-Diamond (87.5%). At the same time, the system proved cheaper than heavyweight multi-agent approaches like Mixture-of-Agents.
One standout feature is called Recursive Test-Time Scaling. The Conductor can select itself as one of the working agents, re-evaluate the output produced by its team, figure out where things went wrong, and assemble a new corrective workflow. In other words, scaling at inference happens not just by “thinking longer,” but by dynamically reconfiguring a new team in response to an error.
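That recursive loop can be sketched in a few lines. The function names here (`conduct`, `execute`, `verify`) are hypothetical stand-ins for the trained components, not an API from the paper; the point is only the control flow, in which failed verification feeds back into a fresh orchestration decision:

```python
def run_with_recursive_scaling(task, conduct, execute, verify, max_rounds=3):
    """Toy loop for the recursive test-time scaling idea (all names
    hypothetical): run the assembled team, and if verification fails,
    let the conductor see the failure and assemble a corrective workflow.
    """
    feedback = ""
    result = None
    for _ in range(max_rounds):
        plan = conduct(task, feedback)       # build (or rebuild) the team
        result = execute(plan)               # run the agent workflow
        ok, feedback = verify(task, result)  # conductor re-evaluates output
        if ok:
            break
    return result
```

Note that compute scales with the number of rounds actually needed, so easy tasks exit after one pass while hard ones trigger reconfiguration.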
The key takeaway isn’t that this is yet another multi-agent framework. What matters is that models are beginning to learn not only how to answer, but how to manage other models. Where AI systems used to be built around a single “smartest” agent, the focus is now shifting toward orchestration, roles, verification, and collective reasoning.
And it seems that Sakana is building its new multi-agent system, Sakana Fugu, precisely on this foundation.
immanuwell•1h ago
a tiny 7b model learning to boss around much bigger llms by figuring out who talks to whom and actually beating them - is genuinely wild