Towards a science of scaling agent systems: When and why agent systems work

https://research.google/blog/towards-a-science-of-scaling-agent-systems-when-and-why-agent-systems-work/
18•gmays•3h ago

Comments

verdverm•1h ago
gonna read this with a grain of salt because I have been rather unimpressed with Google's AI products, save direct API calls to Gemini

The rest is trash they are forcing down our throats

4b11b4•1h ago
Yeah, AlphaGo and Zero were lame. The earth foundation model - that's just ridiculous.

That's sarcasm

---

Your "direct Gemini calls" is maybe the least impressive

edit: This paper is mostly a sort of "quantitative survey". Nothing exciting enough to require a grain of salt

verdverm•1h ago
The underlying models are impressive, be it Gemini (via direct API calls, vs the app or search) or AlphaGo/AlphaFold/etc, which I would include in that classification

The products they build, where the agentic stuff is, are what I find unimpressive. The quality is low, the UX is bad, and they are forced into every product. Notable examples: search in GCloud, gemini-cli, antigravity (not theirs technically, $2B whitelabel deal with windsurf iirc)

So yes, I see it as perfectly acceptable to be more skeptical of Google's take on agentic systems when I find their real world applications lackluster

4b11b4•49m ago
I agree with you in general re "agentic systems". Though they might deliberately not be trying to compete in the "agent harness" space yet.

The antigravity experiment, yes, was via windsurf - probably nobody expected that to take off, but maybe it was work that may have surfaced some lessons worth learning from.

verdverm•44m ago
My hunch is that Google is past its prime, all the good PMs are gone, and now it looks like a chicken hydra with all the heads off, trying to run in multiple directions.

There is no clear vision, coherence, or confidence that the products will be around in another year

nawgz•34m ago
Kind of a weird take given they are one of the strongest AI providers who are the most vertically integrated. Sure, maybe the company isn’t as healthy as it once was, but none of them are - late stage capitalism is rotting most foundations

CuriouslyC•1h ago
This is a neat idea but there are so many variables here that it's hard to make generalizations.

Empirically, a top-level orchestrator that calls out to a planning committee, then generates a task DAG from the plan and runs it in parallel where possible, is the setup I've seen produce the best results in various heterogeneous environments. As models evolve, crosstalk may become less of a liability.
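
A minimal sketch of the "task DAG run in parallel where possible" part of that setup, assuming the plan has already been turned into a dependency map. The function `run_task_dag`, the callable `fake_agent`, and the task names are hypothetical stand-ins, not anything from the comment or the paper; a real system would call a model where `fake_agent` is.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_task_dag(dag, run_task, max_workers=4):
    """Run every task in `dag`, starting each one once its dependencies finish.

    `dag` maps a task name to the set of task names it depends on.
    `run_task(name, dep_results)` is a hypothetical hook that hands one task
    to an agent and returns its result.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    results = {}
    pending = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every task whose dependencies have all completed.
            for name in sorter.get_ready():
                dep_results = {dep: results[dep] for dep in dag.get(name, ())}
                pending[pool.submit(run_task, name, dep_results)] = name
            # Block until at least one running task finishes.
            finished, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in finished:
                name = pending.pop(future)
                results[name] = future.result()
                sorter.done(name)  # unblocks tasks that depended on `name`
    return results


# Hypothetical plan: two independent branches that join at the end.
plan = {
    "plan": set(),
    "frontend": {"plan"},
    "backend": {"plan"},
    "integrate": {"frontend", "backend"},
}


def fake_agent(name, dep_results):
    # Stand-in for a real agent call; it only records which inputs it saw.
    return f"{name}({', '.join(sorted(dep_results))})"


results = run_task_dag(plan, fake_agent)
```

Here "frontend" and "backend" can run concurrently because neither depends on the other, while "integrate" waits for both.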

zby•42m ago
Reasoning is recursive - you cannot isolate where it should be symbolic and where it should be LLM-based (fuzzy/neural). This is the idea that started https://github.com/zby/llm-do - there is also RLM: https://alexzhang13.github.io/blog/2025/rlm/ RLM is simpler - but my approach also has some advantages.

localghost3000•51m ago
I’ve been building a lot of agent workflows at my day job. Something I’ve had a lot of success with when deciding on an orchestration strategy is asking the agent what it recommends as part of the planning phase. This technique of using the agent to help you improve its performance has been a game changer for me in leveraging this tech effectively. YMMV of course. I mostly use Claude Code so who knows with the others.

detroitwebsites•12m ago
The "alignment principle" vs "sequential penalty" finding mirrors my production experience exactly.

I run a multi-agent system where specialized agents handle different business functions (customer support, code review, deployment monitoring). The key insight: task decomposability determines architecture.

Parallelizable tasks (analyzing independent customer tickets, running separate test suites) show massive gains with independent agents. Sequential workflows (debugging a specific issue that requires following a chain of logic) degrade with coordination overhead.

The "tool-use bottleneck" is real. We hit it around 12-15 tools per agent. The coordination tax becomes severe. Solution: role-based tool access. Support agents get 5 tools, deployment agents get 8, code review agents get 6. Overlap is minimal.
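
A minimal sketch of what role-based tool access could look like. Everything named here is hypothetical: the tool names, the registry, and the cap of 12 (taken from the low end of the 12-15 range mentioned above) are illustrative assumptions, and the roles carry far fewer tools than the 5/8/6 split described.

```python
# Assumption: cap set at the low end of the 12-15 tool range mentioned above.
MAX_TOOLS_PER_AGENT = 12

# Hypothetical tool registry: tool name -> callable.
TOOL_REGISTRY = {
    "search_tickets": lambda query: f"tickets matching {query!r}",
    "reply_to_customer": lambda text: f"sent: {text}",
    "read_diff": lambda pr: f"diff for {pr}",
    "post_review_comment": lambda text: f"commented: {text}",
    "read_deploy_logs": lambda service: f"logs for {service}",
    "rollback": lambda service: f"rolled back {service}",
}

# Hypothetical role -> allowed tool names, kept disjoint to minimise overlap.
ROLE_TOOLS = {
    "support": {"search_tickets", "reply_to_customer"},
    "code_review": {"read_diff", "post_review_comment"},
    "deployment": {"read_deploy_logs", "rollback"},
}


def tools_for_role(role):
    """Return only the tools that agents with this role are allowed to see."""
    allowed = ROLE_TOOLS[role]
    if len(allowed) > MAX_TOOLS_PER_AGENT:
        raise ValueError(f"{role!r} has {len(allowed)} tools, above the cap")
    return {name: TOOL_REGISTRY[name] for name in sorted(allowed)}


def call_tool(role, tool_name, *args):
    """Dispatch a tool call, refusing tools outside the role's allow-list."""
    tools = tools_for_role(role)
    if tool_name not in tools:
        raise PermissionError(f"{role!r} agents may not call {tool_name!r}")
    return tools[tool_name](*args)
```

With this shape, `call_tool("support", "rollback", "api")` raises `PermissionError`, so a support agent never has deployment tools in its context.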

One counter-intuitive finding: persistent memory per agent beats centralized knowledge. Each agent has AGENTS.md (instructions), TOOLS.md (available actions), and memory/ directory (session logs). Agents learn from their own mistakes without polluting each other's context.
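
A rough sketch of that per-agent layout, assuming one directory per agent and one log file per session. The helper names and the `.log` naming are assumptions; only `AGENTS.md`, `TOOLS.md`, and `memory/` come from the description above.

```python
from pathlib import Path


def init_agent_workspace(root, agent_name):
    """Create <root>/<agent_name>/ with AGENTS.md, TOOLS.md and memory/."""
    workspace = Path(root) / agent_name
    (workspace / "memory").mkdir(parents=True, exist_ok=True)
    for filename in ("AGENTS.md", "TOOLS.md"):
        (workspace / filename).touch()
    return workspace


def append_session_log(workspace, session_id, note):
    """Append one note to this agent's own log for the given session."""
    log_path = workspace / "memory" / f"{session_id}.log"  # naming is assumed
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(note + "\n")
    return log_path


def load_context(workspace):
    """Build context from this agent's files only, never another agent's."""
    parts = [
        (workspace / name).read_text(encoding="utf-8")
        for name in ("AGENTS.md", "TOOLS.md")
    ]
    for log_path in sorted((workspace / "memory").glob("*.log")):
        parts.append(log_path.read_text(encoding="utf-8"))
    return "\n".join(parts)
```

Because `load_context` reads from a single workspace, a note logged by one agent does not appear in another agent's context.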

The error amplification metric (17.2x for independent vs 4.4x for centralized) explains why we use a hub-and-spoke model with human checkpoints at handoff boundaries.

Documented these patterns at howtoopenclawfordummies.com for anyone building similar systems.

Apple I Advertisement (1976)

http://apple1.chez.com/Apple1project/Gallery/Gallery.htm
133•janandonly•3h ago•96 comments

1-Click RCE to steal your Moltbot data and keys

https://depthfirst.com/post/1-click-rce-to-steal-your-moltbot-data-and-keys
35•arwt•1h ago•6 comments

Adventure Game Studio: OSS software for creating adventure games

https://www.adventuregamestudio.co.uk/
208•doener•7h ago•41 comments

Netbird – Open Source Zero Trust Networking

https://netbird.io/
594•l1am0•11h ago•224 comments

Efficient String Compression for Modern Database Systems

https://cedardb.com/blog/string_compression/
43•jandrewrogers•2d ago•1 comments

I taught my neighbor to keep the volume down

https://idiallo.com/blog/teaching-my-neighbor-to-keep-the-volume-down
273•firefoxd•2h ago•60 comments

Typechecking is undecidable when 'type' is a type (1989) [pdf]

https://dspace.mit.edu/bitstream/handle/1721.1/149366/MIT-LCS-TR-458.pdf?sequence=6
18•zem•2d ago•4 comments

TIL: Apple Broke Time Machine Again on Tahoe

https://taoofmac.com/space/til/2026/02/01/1630
86•rcarmo•1h ago•42 comments

MicroPythonOS graphical operating system delivers Android-like user experience

https://www.cnx-software.com/2026/01/29/micropythonos-graphical-operating-system-delivers-android...
142•mikece•3d ago•36 comments

Show HN: ÆTHRA – Writing Music as Code

41•CzaxTanmay•2d ago•11 comments

Clearspace (YC W23) Is Hiring an Applied Researcher (ML)

https://www.ycombinator.com/companies/clearspace/jobs/GOWiDwp-research-engineer-at-clearspace
1•anteloper•2h ago

Reliable 25 Gigabit Ethernet via Thunderbolt

https://kohlschuetter.github.io/blog/posts/2026/01/27/tb25/
162•kohlschuetter•5d ago•93 comments

What I learned building an opinionated and minimal coding agent

https://mariozechner.at/posts/2025-11-30-pi-coding-agent/
306•SatvikBeri•11h ago•131 comments

Amiga Unix (Amix)

https://www.amigaunix.com/doku.php/home
90•donatj•10h ago•32 comments

FOSDEM 2026 – Open-Source Conference in Brussels – Day#1 Recap

https://gyptazy.com/blog/fosdem-2026-opensource-conference-brussels/
154•yannick2k•10h ago•90 comments

A Crisis comes to Wordle: Reusing old words

https://forkingmad.blog/wordle-crisis/
18•cyanbane•3h ago•17 comments

The Book of PF, 4th edition

https://nostarch.com/book-of-pf-4th-edition
180•0x54MUR41•13h ago•35 comments

English professors double down on requiring printed copies of readings

https://yaledailynews.com/articles/english-professors-double-down-on-requiring-printed-copies-of-...
75•cmsefton•5h ago•113 comments

Anciente map of Fairyland. Places from nursery rhymes, fairy tales etc.

https://collections.leventhalmap.org/search/commonwealth:3f463773q
43•speckx•5d ago•9 comments

VisualJJ – Jujutsu in Visual Studio Code

https://www.visualjj.com/
133•demail•4d ago•51 comments

Jack Kerouac's 37 metre-long, first draft scroll of On the Road to be auctioned

https://www.theguardian.com/books/2026/jan/30/jack-kerouac-on-the-road-first-draft-scroll-to-be-a...
44•mitchbob•2d ago•15 comments

List animals until failure

https://rose.systems/animalist/
301•l1n•20h ago•160 comments

Aging muscle stem cells shift from rapid repair to long-term survival

https://phys.org/news/2026-01-sprint-marathon-aging-muscle-stem.html
55•bikenaga•3h ago•13 comments

Light exposure and aspects of cognitive function in everyday life

https://www.nature.com/articles/s44271-025-00373-9
34•PaulHoule•2h ago•2 comments

A web server on a single floppy disk

http://floppy.ddns.net/
75•ActionRetro•3d ago•31 comments

The history of C# and TypeScript with Anders Hejlsberg [video]

https://www.youtube.com/watch?v=uMqx8NNT4xY
167•doppp•5d ago•128 comments

In praise of --dry-run

https://henrikwarne.com/2026/01/31/in-praise-of-dry-run/
275•ingve•1d ago•149 comments

Cells use 'bioelectricity' to coordinate and make group decisions

https://www.quantamagazine.org/cells-use-bioelectricity-to-coordinate-and-make-group-decisions-20...
161•marojejian•21h ago•73 comments

'Right-to-Compute' Laws May Be Coming to Your State This Year

https://www.vktr.com/ai-ethics-law-risk/right-to-compute-laws/
15•ohjeez•1h ago•10 comments