Solving Super Agentic Planning

https://www.rtrvr.ai/blog/v12-release-notes
2•arjunchint•15h ago

Comments

arjunchint•15h ago
Manus and GenSpark showed the importance of giving AI Agents access to an array of tools that are themselves agents, such as a browser agent, a CLI agent, or a slides agent. Users found it super useful to just input some text and have the agent figure out a plan and orchestrate execution.

But even these approaches face limitations: after a certain number of steps, the AI Agent starts to lose context, repeat steps, or just go completely off the rails.

At rtrvr ai, we're building an AI Web Agent Chrome Extension that orchestrates complex workflows across multiple browser tabs. We followed the Manus approach of setting up a planner agent that calls abstracted sub-agents to handle browser actions, generate Sheets with scraped data, or crawl through pages of a website.

But we also hit this limit of the planner losing competence after 5 or so minutes.

After a lot of trial and error, we found a combination of three techniques that pushed our agent's independent execution time from ~5 minutes to over 30 minutes. I wanted to share them here to see what you all think.

We saw that the key challenge for AI Agents is to efficiently encode/discretize the State-Action Space of an environment. Building on this insight, we set up:

Smarter Orchestration: Instead of a monolithic planning agent that holds all the context, we moved to a hierarchical model. The high-level "orchestrator" agent manages the overall goal but delegates execution and context to specialized sub-agents. It passes only the necessary context to each sub-agent, which prevents confusion in the sub-agents, and the planning agent itself isn't loaded with the entire context of each step.
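To make the context-scoping idea concrete, here is a minimal Python sketch. It is not rtrvr's implementation: `Task`, `orchestrate`, the `context_keys` field, and the `task1`, `task2` naming of stored results are all hypothetical names chosen for illustration. The only point it shows is that each sub-agent receives the keys it was explicitly granted, while the orchestrator keeps a short log instead of the full outputs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# A sub-agent is modeled as a plain callable: (goal, scoped_context) -> result.
SubAgent = Callable[[str, Dict[str, Any]], Any]


@dataclass
class Task:
    agent: str                      # which sub-agent runs this task
    goal: str                       # abstract goal handed to the sub-agent
    context_keys: List[str] = field(default_factory=list)  # keys it may see


def orchestrate(
    tasks: List[Task],
    agents: Dict[str, SubAgent],
    shared: Dict[str, Any],
) -> List[str]:
    """Run tasks in order, giving each sub-agent only its granted context."""
    log: List[str] = []
    for i, task in enumerate(tasks, start=1):
        # Build the scoped view: nothing outside context_keys is visible.
        scoped = {key: shared[key] for key in task.context_keys}
        # Store the full result in the shared store, not in the planner's log.
        shared[f"task{i}"] = agents[task.agent](task.goal, scoped)
        log.append(f"task{i}:{task.agent}:ok")
    return log
```

In this sketch the planner would only ever read `log`, so its context grows by one short line per task regardless of how large a sub-agent's output is.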

Abstracted Planning: We reworked our planner to generate as abstract a goal as possible for each step and fully delegate it to the specialized sub-agent. This required making the sub-agents more general so they can handle ambiguity and a wider range of possible actions. Minimizing the number of planning calls seemed to be the most obvious way to get the agent to run longer.

Agentic Memory Management: To reduce context for the planner, we encoded the context from each step as variables that the planner can assign as parameters to subsequent steps. So instead of hoping the planner remembers a piece of data from step 2 to reuse in step 7, it just assigns step2.sheetOutput. This removes the need to dump outputs into the planner's context, which prevents context window bloat and confusion.
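A rough Python sketch of the variable-reference idea, under stated assumptions: the planner emits a reference string such as `step2.sheetOutput` instead of the data itself, and the runtime swaps in the stored value just before the sub-agent is called. The `$` prefix used to mark a reference, the `resolve` function, and the shape of the `outputs` store are all hypothetical; the post does not say how references are encoded.

```python
from typing import Any, Dict


def resolve(
    params: Dict[str, Any],
    outputs: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Replace "$stepN.field" references with the stored step output."""
    resolved: Dict[str, Any] = {}
    for name, value in params.items():
        if isinstance(value, str) and value.startswith("$"):
            # "$step2.sheetOutput" -> step "step2", field "sheetOutput"
            step, _, field_name = value[1:].partition(".")
            resolved[name] = outputs[step][field_name]
        else:
            # Literal values pass through unchanged.
            resolved[name] = value
    return resolved


# Outputs are kept outside the planner's context, keyed by step.
outputs = {"step2": {"sheetOutput": "sheet-id-123"}}

# What the planner writes for step 7: a reference, not the data.
step7_params = {"source": "$step2.sheetOutput", "limit": 10}

print(resolve(step7_params, outputs))
# prints {'source': 'sheet-id-123', 'limit': 10}
```

The planner's prompt then only needs the list of available variable names, so its context stays roughly constant in size no matter how much data each step produced.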

This is what we found useful but I'm super curious to hear:

How are you all tackling long-horizon planning and context drift?

Are you using similar hierarchical planning or memory management techniques?

What's the longest you've seen an agent run reliably, and what was the key breakthrough?

quarkcarbon279•14h ago
Coincidentally, Anthropic also recently published similar findings and approaches on multi-agent orchestration and memory management: https://www.anthropic.com/engineering/built-multi-agent-rese...
