Show HN: Pilo – open-source agentic web automation engine by Mozilla

13•MrTravisB•2h ago

Hello HN,

We are the team behind Tabstack (https://tabstack.ai) - part of Mozilla. We just open-sourced Pilo (pronounced PIE-low), the core engine that powers our automation platform. You can check it out on GitHub at https://github.com/mozilla/pilo.

Pilo is an agentic web automation library. Instead of writing rigid scripts with CSS selectors, you give it a natural language goal (e.g., "Find the best pizza in Seattle and extract the ratings") and it autonomously navigates the browser to achieve it.

We built this because we were struggling to make reliable agents for our own /automate endpoint. Existing tools were either too brittle (breaking on minor DOM changes) or too heavy (feeding raw HTML to LLMs, blowing up context windows).

Here is how Pilo solves those problems:

- Accessibility Tree over HTML: Instead of parsing raw HTML "soup," Pilo captures the browser's accessibility tree (via Playwright's _snapshotForAI). This gives the LLM a semantic, stable view of the page (buttons, links, inputs) rather than div hell.

- Context Compression: We pipe that tree through a compression engine. We map verbose tags (like listitem -> li), shorten reference IDs, and deduplicate repetitive text. This reduces token usage by 60-80% without losing interactive elements, allowing for much longer agent loops.

- Layered Error Handling: The web is flaky. Pilo treats navigation failures as distinct from interaction failures. It uses timeout escalation for network issues (doubling wait times) and will automatically restart the browser instance if it detects a "stuck" state or DNS failure.

- Agentic Loop: It follows a strict Plan -> Observe -> Act -> Validate loop. It even includes a separate validation step where a second LLM "grades" the final output against the original success criteria before returning it.
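
To make the compression and timeout-escalation bullets concrete, here is a minimal Python sketch. It is not Pilo's code: the snapshot syntax, the `[ref=...]` format, every alias other than listitem -> li, and the function names `compress_snapshot` and `escalating_timeouts` are assumptions made purely for illustration.

```python
import re

# Hypothetical alias table; the post only confirms listitem -> li.
ROLE_ALIASES = {"listitem": "li", "paragraph": "p", "heading": "h"}

# Assumed snapshot syntax: one node per line, e.g. '- link "Home" [ref=some-id]'.
ROLE_PATTERN = re.compile(r"^(\s*-\s*)(\w+)")
REF_PATTERN = re.compile(r"\[ref=([^\]]+)\]")


def compress_snapshot(snapshot: str) -> tuple[str, dict[str, str]]:
    """Return the compressed snapshot and a map from short refs to original refs."""
    short_for: dict[str, str] = {}  # original ref -> short ref
    ref_map: dict[str, str] = {}    # short ref -> original ref

    def shorten(match: re.Match) -> str:
        original = match.group(1)
        if original not in short_for:
            short = str(len(short_for) + 1)
            short_for[original] = short
            ref_map[short] = original
        return f"[{short_for[original]}]"

    def alias(match: re.Match) -> str:
        return match.group(1) + ROLE_ALIASES.get(match.group(2), match.group(2))

    kept: list[str] = []
    previous = None
    for line in snapshot.splitlines():
        interactive = REF_PATTERN.search(line) is not None
        line = ROLE_PATTERN.sub(alias, line)   # map verbose role names
        line = REF_PATTERN.sub(shorten, line)  # shorten reference IDs
        # Drop consecutive repeats, but never a line that carries a ref.
        if line == previous and not interactive:
            continue
        kept.append(line)
        previous = line
    return "\n".join(kept), ref_map


def escalating_timeouts(base_ms: int, attempts: int) -> list[int]:
    """Timeouts that double on each retry, starting from base_ms."""
    return [base_ms * 2 ** attempt for attempt in range(attempts)]
```

The returned `ref_map` is what lets a caller translate a short ID chosen by the LLM back into the original element reference before acting on the page. A real pipeline would likely deduplicate more aggressively than this consecutive-line check.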

The "Cool" Part (Browser Extension): Since the core logic is decoupled from the runtime, we packaged it into a browser extension. You can install it, type a prompt, and literally watch the agent drive your local browser tab in real time. It’s a great way to debug how the LLM "sees" the page.

Why Open Source? We sell the managed infrastructure (scaling browsers, persistent sessions, etc.) at Tabstack. But the execution engine itself, the thing that decides "click here" or "scroll there", should be open. You can run Pilo entirely on your own machine with your own API keys without paying us a dime.

You can read more about it on our blog: https://tabstack.ai/blog/introducing-pilo-browser-automation

Or check out the repo, install it, and give it a try - https://github.com/mozilla/pilo

We’d love to hear your feedback on the compression pipeline or how you’re handling agent state in your own projects.

Happy to answer any questions!

Comments

verdverm•1h ago

The main issue I see with everyone and their brother making specialized agentic frameworks is:

1. I now have to understand N frameworks, their quirks and handles, their prompts and tools. I certainly don't want to be locked into their strict loop definition.

2. Most of them could be extensions, even just a skill, within other frameworks.

I prefer to remain a minimalist for now and use projects like this for inspiration.

MrTravisB•1h ago

We completely agree. Framework fatigue is real, and getting locked into a rigid loop is frustrating.

Choices are great, and our goal is to let you piece together a setup to your own liking. We want Pilo to work with your existing tools, not against them. If you just want to rip out our accessibility tree compression pipeline and use it as a standalone skill in your own custom framework, we consider that a massive win.

That is exactly why we are open sourcing it. We want to see what others can do with it.

If there is a framework or tool this could work with but does not currently, we would love to hear about it.

verdverm•19m ago

I use ADK, which has many points where third parties can plug in. I'm also involved in the development (from the outside). I will look more into Pilo and how this could work. Would save me a bunch of effort!

I'll open an issue for tracking

---

said issue: https://github.com/mozilla/pilo/issues/318
