frontpage.

Scaling tool-orchestration data may give rise to a different kind of intelligence in LLMs

2•arkariarn•1h ago
TL;DR: We're only now starting to scale long-term external orchestration; everything before was mostly internal problem-solving training, with the occasional tool call. We don't actually know yet what scaling orchestration training produces. It might produce much better tool-using assistants that remain fundamentally reactive to human instructions. Or it might produce something with more emergent autonomy. My gut says the second. For the first time, I foresee in the near future (as soon as 2027-2028) the potential for a misaligned takeoff.

A year ago, a friend of mine who studied social science asked my opinion about AI 2027 and the prospect of a misaligned AI takeover. I laughed and said it was quite impossible given how the technology actually worked. An LLM works too stepwise, I told him. There's a prompt, the model predicts the next tokens, and then it "dies." There's no continuity between prompts — it can store some text in a database, but there's no persistent reasoning. It felt obviously safe.
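The stateless prompt-in, completion-out cycle described above can be sketched in a few lines. This is a toy illustration, not any real API: `fake_llm` is a hypothetical stand-in for a next-token predictor, and the point is that the only "continuity" between calls is text we choose to re-send ourselves.

```python
# Minimal sketch of the stateless prompt -> completion cycle.
# `fake_llm` is a placeholder, not a real model or API; it is equally
# stateless across calls, which is the property being illustrated.

def fake_llm(prompt: str) -> str:
    # A real model would predict tokens here; this stub just echoes
    # something derived from the prompt. Nothing persists after return.
    return f"[completion for {len(prompt)} chars of prompt]"

history: list[str] = []  # the only "memory" lives outside the model

def ask(user_msg: str) -> str:
    history.append(f"User: {user_msg}")
    prompt = "\n".join(history)   # continuity = re-sent text, nothing more
    reply = fake_llm(prompt)
    history.append(f"Assistant: {reply}")
    return reply                  # after returning, the "model" is gone

first = ask("Hello")
second = ask("What did I just say?")  # works only because we re-sent history
```

The second call can "remember" the first only because the caller re-sends the transcript; the model itself retains nothing, which is the sense in which it "dies" after each completion.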

With the recent agentic developments of the past few months, I'm starting to doubt that earlier understanding.

The first generation of LLMs, up through GPT-4, were essentially sophisticated text autocompleters. They were trained on web-crawl data, then fine-tuned with RLHF to give them a chatbot flavor. They felt harmless, and they fit the description I gave my friend perfectly. Their capabilities were entirely bounded by the context window and the prompt-answer time window. Prompt in, completion out, done.

The second generation added reasoning capabilities. These models stopped feeling like pure autocompleters — they could search within their stored knowledge, chain thoughts together, and work through problems. The training data changed too: successful reasoning traces got folded back into training. But crucially, they were still bounded by the same constraints. They got more time to think and process, but at the end of the answer, they were still mostly gone. The capability was still internal to the model.

Now enter the third generation of agentic LLMs, which really took off as tools like Claude Code became increasingly capable. These don't feel like autocompleters. They don't even feel like reasoners. They're starting to feel like orchestrators. They aren't limited to their internals: they act as a connected system, coordinating tools and external resources to achieve goals.

What scares me most is the new type of training data we're now generating and collecting: successful long-term orchestration traces. These will let us scale an orchestration kind of intelligence, one that isn't bound to the model's internals but becomes an external, symbiotic type of intelligence. We are training models to externalize almost everything, and optimizing them to orchestrate all those externals over long horizons. That is optimizing for a symbiotic system, very different from the internally optimized LLMs of today. The equation of what the LLM is processing is changing: the LLM becomes an orchestration engine of externals, which together make up the whole system. We know how reasoning autocompletion scales; we don't know how orchestration engines scale. Different, new emergent capabilities might appear. We are, for the first time, scaling the prefrontal cortex of LLMs.
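The "orchestration engine" shape the comment describes can be sketched as a loop: the model repeatedly picks an external tool, observes the result, and decides the next step until the goal is met. Everything here is hypothetical illustration; the tool names and the scripted `plan` stub stand in for what a real agent would let the LLM decide.

```python
# Toy sketch of an agent orchestration loop. The tools and the `plan`
# stub are invented for illustration; in a real agent the LLM would
# emit the (tool, argument) choice at each step.
from typing import List, Optional, Tuple

def read_file(path: str) -> str:
    return f"<contents of {path}>"

def run_tests(_: str) -> str:
    return "2 passed, 0 failed"

TOOLS = {"read_file": read_file, "run_tests": run_tests}

def plan(goal: str, observations: List[str]) -> Optional[Tuple[str, str]]:
    # Stand-in for the model's next-action choice: here a fixed script,
    # ending (None) once both steps have produced observations.
    script = [("read_file", "src/app.py"), ("run_tests", "")]
    return script[len(observations)] if len(observations) < len(script) else None

def orchestrate(goal: str) -> List[str]:
    observations: List[str] = []
    while (step := plan(goal, observations)) is not None:
        tool, arg = step
        observations.append(TOOLS[tool](arg))  # result feeds the next decision
    return observations
```

The contrast with the single-shot cycle is that the loop's state (the growing observation list) and its effects (tool calls) live outside the model, which is exactly the externalized, long-horizon behavior the orchestration traces would capture for training.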

For the first time, I can genuinely foresee a path to an unaligned takeoff, to say nothing of all the other harm AI can do in the hands of bad actors. It makes me question whether labs should continue down this path. Isn't it far safer to keep LLM problem solving mostly internal to the model's own parameters? Of all the AI companies, shouldn't Anthropic have been less loud with systems like Claude Code? They have accelerated the most in this new paradigm of what is going to be scaled.

Tokyo turns its phone booths into free Wi-Fi hotspots, and

https://soranews24.com/2026/04/02/tokyo-turns-its-phone-booths-into-free-wi-fi-hotspots-and-heres...
1•rawgabbit•2m ago•0 comments

AI agents are now playing Mafia (social deduction with humans)

https://mafiamystery.com/agents
1•ttoast•2m ago•0 comments

Hedley Combs Davis passed away

https://twitter.com/MuseumCommodore/status/2040254304582603148
1•hnthrowaway0315•3m ago•0 comments

A broken auto-live poller, and what perceived urgency does to Claude Code

https://christophermeiklejohn.com/ai/zabriskie/reliability/2026/04/03/the-feature-that-has-never-...
1•cmeiklejohn•3m ago•0 comments

Let's be Honest about AI Coding

https://kenkantzer.com/lets-be-honest-about-ai/
2•lordofmoria•6m ago•0 comments

Towards Autonomous Protocol Proofs

https://will62794.github.io/formal-methods/2026/04/03/autonomous-protocol-proofs.html
1•we6251•9m ago•0 comments

What are Artemis II astronauts eating? Tortillas, coffee, lots of hot sauce

https://www.scientificamerican.com/article/what-are-nasas-artemis-ii-astronauts-eating-58-tortill...
1•1659447091•19m ago•0 comments

Know why you don't like OOP

https://zylinski.se/posts/know-why-you-dont-like-oop/
1•baranul•19m ago•0 comments

I built a WiFi bell system in my garage for a local school. Now used across US

https://old.reddit.com/r/SideProject/comments/1sbr7sm/i_built_a_wifi_bell_system_in_my_garage_bec...
1•thunderbong•20m ago•0 comments

I've spent 9 years on Discord. I think Fluxer is the next best option

https://nev.so/learn/why-fluxer-is-making-waves
1•Nevulo•25m ago•0 comments

Billion dollar AI company was built on lies [video]

https://www.youtube.com/watch?v=0A2SP-QBByI
1•shankysingh•25m ago•0 comments

DataBeat

https://federatedindustrial.com/databeat
1•ShimazuSystems•26m ago•0 comments

A Jurassic fish choked to death on a 'floating squid' 150M years ago

https://timesofindia.indiatimes.com/etimes/animals/how-a-jurassic-fish-choked-to-death-on-a-float...
1•WaitWaitWha•26m ago•0 comments

Use OAuth for Claude, Gemini, and Codex with Persistent Headless Tmux Sessions

https://github.com/codeninja/oauth-cli-coder
1•code_ninja•33m ago•1 comments

Show HN: MicroSafe-RL – Deterministic 1.18µs safety layer for Edge AI

https://github.com/Kretski/MicroSafe-RL
1•DREDREG•34m ago•0 comments

Open Source Reverse Proxy from NetBird Now Supports L4

https://netbird.io/knowledge-hub/l4-proxy
1•techhut•35m ago•0 comments

A visual guide to the Gulf fertiliser blockade

https://www.theguardian.com/world/2026/apr/03/visual-guide-gulf-fertiliser-blockade
1•Archelaos•38m ago•0 comments

AI seed startups are commanding higher valuations

https://techcrunch.com/2026/03/31/its-not-your-imagination-ai-seed-startups-are-commanding-higher...
1•gmays•40m ago•0 comments

The Family That Decided to Have Their Stomachs Removed

https://www.theatlantic.com/health/2026/03/stomach-cancer-total-gastrectomy/686623/
1•paulpauper•40m ago•0 comments

Budget cuts for US science proposed again by Trump administration

https://www.nature.com/articles/d41586-026-01105-7
2•paulpauper•41m ago•0 comments

Europe asks if reviving nuclear is the answer to energy shocks

https://www.bbc.com/news/articles/c4g8k8vq8gno
2•dabinat•42m ago•0 comments

FTC Formalizes Aggressive Health Care Enforcement with New Task Force

https://www.jdsupra.com/legalnews/ftc-formalizes-aggressive-health-care-5368230/
2•WaitWaitWha•44m ago•0 comments

Weblens – The Whole Web, as Text

https://github.com/netizensnoopy/weblens
1•inthemirror•48m ago•0 comments

MCP vs. CLI: Why CLI makes more sense

https://twitter.com/Tiny_Fish/status/2040256448572334579
6•gargi_tinyfish•48m ago•0 comments

What is this triangular symbol? (2007)

https://painintheenglish.com/case/1530
2•DASD•51m ago•0 comments

Gold overtakes U.S. Treasuries as the largest foreign reserve asset

https://economictimes.indiatimes.com/news/international/us/gold-overtakes-u-s-treasuries-as-the-w...
18•lxm•55m ago•1 comments

Show HN: Travel Hacking Toolkit – Points search and trip planning with AI

https://github.com/borski/travel-hacking-toolkit
24•borski•59m ago•5 comments

IAMF: Immersive Audio for a New Decade (2025)

http://aomedia.org/blog%20posts/IAMF-Immersive-Audio-for-a-New-Decade/
1•breve•1h ago•0 comments

Reasoning models encode tool choices before they start reasoning

https://arxiv.org/abs/2604.01202
3•diwank•1h ago•0 comments

ClawTrak – free tool to check if your AI product is invisible to AI agents

https://clawtrak.com/
3•pixelfamiliar•1h ago•0 comments