Learning to Orchestrate Agents in Natural Language with the Conductor

https://openreview.net/forum?id=U23A2BUKYt
1•zaevlad•1h ago

Comments

zaevlad•1h ago
Sakana AI has presented their work “Learning to Orchestrate Agents in Natural Language with the Conductor,” which has been accepted to ICLR 2026. The idea is simple but powerful: instead of forcing a single model to handle an entire task on its own, the researchers trained a separate 7B model to act as a manager for other AIs.

This Conductor doesn’t write code or solve tasks directly. It looks at a problem and decides which agents to deploy, what subtask to give each one, and what context to provide. Essentially, it’s not just a router between models — it’s a meta-prompt engineer that assembles a working AI team tailored to a specific task.

What’s most interesting is that this behavior emerged not from hardcoded rules, but through reinforcement learning. For simple questions, the Conductor might rely on a single model call. For complex tasks, it builds a chain on its own: a planner, an executor, a verifier, and a correcting agent. It closely resembles how a strong team breaks down complex work into distinct roles.
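To make the planner, executor, verifier idea concrete, here is a minimal sketch of a workflow represented as data: each step names an agent, a natural-language subtask, and which earlier outputs it may see. It is not the paper's implementation. `Step`, `run_workflow`, `make_stub`, and the agent names are hypothetical, and the agents are stubs standing in for real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# An agent is anything that maps a prompt to a text response (stubbed here).
Agent = Callable[[str], str]


@dataclass
class Step:
    agent: str                 # hypothetical name of the worker to call
    subtask: str               # instruction a conductor would write for it
    context: List[int] = field(default_factory=list)  # earlier steps it may see


def run_workflow(steps: List[Step], agents: Dict[str, Agent], question: str) -> List[str]:
    """Run the steps in order, showing each agent only the outputs it was granted."""
    outputs: List[str] = []
    for step in steps:
        shared = "\n".join(outputs[i] for i in step.context)
        prompt = f"Question: {question}\nTask: {step.subtask}\nContext:\n{shared}"
        outputs.append(agents[step.agent](prompt))
    return outputs


def make_stub(name: str) -> Agent:
    """Stand-in for a real model call; returns a fixed tag instead of an answer."""
    return lambda prompt: f"<{name} output>"


agents = {name: make_stub(name) for name in ("planner", "executor", "verifier")}

# A workflow of the kind a conductor might emit for a harder task (assumed shape).
steps = [
    Step("planner", "Break the problem into steps."),
    Step("executor", "Carry out the plan.", context=[0]),
    Step("verifier", "Check the result against the plan.", context=[0, 1]),
]

outputs = run_workflow(steps, agents, "Fix the failing test")
print(outputs[-1])
```

For a simple question the same structure collapses to a single `Step`, which matches the single-model-call case described above.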

The results look impressive. The 7B Conductor was able to outperform every individual model in its pool, including GPT-5, Gemini, Claude, and the open-source models available at the time of the research. The paper reports new state-of-the-art results: 83.9% on LiveCodeBench and 87.5% on GPQA-Diamond. At the same time, the system proved cheaper than heavyweight multi-agent approaches like Mixture-of-Agents.

One standout feature is called Recursive Test-Time Scaling. The Conductor can select itself as one of the working agents, re-evaluate the output produced by its team, figure out where things went wrong, and assemble a new corrective workflow. In other words, scaling at inference happens not just by “thinking longer,” but by dynamically reconfiguring a new team in response to an error.
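The recursive loop can be sketched roughly as follows, assuming the review step is a function that returns feedback, or `None` once the answer is accepted. The names `recursive_solve`, `propose`, `execute`, and `review` are hypothetical, and the stubs replace real model calls; how the paper actually wires the Conductor into its own workflow is not shown here.

```python
from typing import Callable, List, Optional, Tuple

Workflow = List[str]  # simplified: just the agent names, in call order


def recursive_solve(
    question: str,
    propose: Callable[[str, Optional[str]], Workflow],
    execute: Callable[[Workflow, str], str],
    review: Callable[[str, str], Optional[str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Propose a team, run it, review the result, and re-plan on failure."""
    feedback: Optional[str] = None
    answer = ""
    for round_no in range(1, max_rounds + 1):
        workflow = propose(question, feedback)   # new team, informed by feedback
        answer = execute(workflow, question)
        feedback = review(question, answer)      # None means "accepted"
        if feedback is None:
            return answer, round_no
    return answer, max_rounds                    # give up after the budget


# Stubs: the first team gets it wrong, the corrective team adds a verifier.
def propose(question: str, feedback: Optional[str]) -> Workflow:
    return ["solver"] if feedback is None else ["solver", "verifier"]


def execute(workflow: Workflow, question: str) -> str:
    return "42" if "verifier" in workflow else "41"


def review(question: str, answer: str) -> Optional[str]:
    return None if answer == "42" else "answer looks off by one"


result = recursive_solve("What is 6 * 7?", propose, execute, review)
print(result)
```

The `max_rounds` cap is an assumption of this sketch: some bound is needed so that a team that never passes review does not loop forever.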

The key takeaway here isn’t just that there’s another multi-agent framework. What matters more is this: models are beginning to learn not only how to answer, but how to manage other models. Whereas AI systems used to be built around a single “smartest” agent, the focus is now shifting toward orchestration, roles, verification, and collective reasoning.

And it seems that Sakana is building its new multi-agent system, Sakana Fugu, precisely on this foundation.

immanuwell•1h ago
a tiny 7b model learning to boss around much bigger llms by figuring out who talks to whom and actually beating them - is genuinely wild

Ancient South Americans arrived in three waves–and had some surprising ancestry

https://www.science.org/content/article/people-settled-south-america-three-distinct-waves-surpris...
1•paulpauper•42s ago•0 comments

What will the next plan for spam be?

https://broodnet.com/blog/post/what-will-the-next-plan-for-spam-be/
1•jtav_singular•44s ago•0 comments

The China-shocked towns are coming back?

https://www.nytimes.com/2026/04/20/opinion/america-manufacturing-recovery-china.html
1•paulpauper•1m ago•0 comments

An open-source spec for Codex orchestration: Symphony

https://openai.com/index/open-source-codex-orchestration-symphony/
1•xngbuilds•4m ago•0 comments

You can beat the binary search

https://lemire.me/blog/2026/04/27/you-can-beat-the-binary-search/
1•vok•4m ago•0 comments

Super ZSNES – GPU Powered SNES Emulator

https://zsnes.com/
1•haunter•6m ago•0 comments

SharkMCP: A Swiss-knife MCP server for analysing PCAP files

https://github.com/weirdmachine64/SharkMCP
1•zerodaysbroker•8m ago•0 comments

Microsoft, OpenAI end exclusivity agreement, opening up potential partnerships

https://www.tomshardware.com/tech-industry/microsoft-and-openai-end-exclusivity-agreement-opening...
1•thunderbong•10m ago•0 comments

Show HN: Open-source tool to explore malware clusters and shared infrastructure

https://malwaresiblings.up.railway.app/
1•hi2poc•10m ago•0 comments

Taylor Swift files to trademark voice and image after AI concerns

https://www.bbc.com/news/articles/crm1mygrmv2o
1•dryadin•10m ago•0 comments

Steam Controller Review

https://www.rockpapershotgun.com/steam-controller-review-2026
3•Tomte•11m ago•0 comments

John Ternus says Apple has 'so much' opportunity to expand services

https://9to5mac.com/2026/04/27/john-ternus-says-apple-has-so-much-opportunity-to-expand-services/
1•cdrnsf•11m ago•0 comments

Open-Source KiCad PCBs for Common Arduino, ESP32, RP2040 Boards

https://github.com/Hanqaqa/Easyduino
3•Hanqaqa•11m ago•0 comments

Apple Store Union Staff at Closing Location Accuse Company of Retaliation

https://www.bloomberg.com/news/articles/2026-04-27/apple-store-union-staff-at-closing-location-ac...
3•cdrnsf•14m ago•0 comments

GitHub is having issues now

https://www.githubstatus.com
6•SenHeng•14m ago•1 comment

Show HN: RedSOC – 100% prompt injection success on AI SoC assistants

https://github.com/krishnakaanthreddyy1510-cell/RedSOC
1•krishna3145•15m ago•0 comments

XiaomiMiMo/MiMo-v2.5

https://huggingface.co/XiaomiMiMo/MiMo-V2.5
1•Philpax•15m ago•0 comments

China blocks Meta from acquiring AI startup Manus

https://apnews.com/article/china-meta-manus-ai-acquisition-5f8012791f86f719a24a3ebac06d9b0a
1•hentrep•16m ago•0 comments

Cristina Barbosa for Lot

https://brand.lot-systems.com/Cristina_Barbosa_for_LOT.png
1•vadikmarmeladov•16m ago•1 comment

Show HN: Find people researching areas similar to you (HuggingFace)

https://foryu.me/
1•ymaws•16m ago•1 comment

Ask HN: Alternatives to GitHub Copilot?

1•baobabKoodaa•16m ago•0 comments

Large Language Models Are Not Table Saws

https://agentultra.com/blog/large-language-models-are-not-table-saws/index.html
1•speckx•18m ago•0 comments

Claude Code with Jupyter Notebooks via MCP

https://www.reviewnb.com/claude-code-with-jupyter-notebooks
1•amirathi•18m ago•0 comments

Tell HN: GitHub Filters Seem Broken

2•philip1209•19m ago•0 comments

Alit: One Pic, 9 Textures

https://alit.dev/?s=qgxj07ms
1•kulesh•19m ago•0 comments

Git-based cache saves 50% on token usage

https://old.reddit.com/r/vibecoding/comments/1sx4agk/gitbased_cache_saves_50_on_token_usage/
2•syumei•20m ago•0 comments

Every new car will be required to put constant surveillance on the driver by 2027

https://twitter.com/pubity/status/2047994026029433116
1•bilsbie•20m ago•0 comments

Show HN: Alit – AI-generated PBR maps with real-time relighting

https://alit.dev/
1•kulesh•21m ago•0 comments

OpenAI Models Coming to AWS

https://twitter.com/ajassy/status/2048806022253609115
1•ke4qqq•24m ago•0 comments

Behind the system of private market for artifacts [video]

https://www.youtube.com/watch?v=lr7Bb93-ZaE
1•nalinidash•25m ago•0 comments