Show HN: Deterministic browser control for AI agents (~90% on Mind2Web)

https://github.com/theredsix/agent-browser-protocol

12•theredsix•8h ago

Comments

theredsix•8h ago

Hi HN, op here! This is an open source browser protocol for LLM agents.

The browser shows the model the current page, the model chooses the next action, and the browser returns the new state. Between steps, JavaScript and time are frozen so the page stays still while the model thinks.

That makes things like ecommerce shopping and popup-heavy web app workflows much more reliable.

Using this setup, the project gets ~90% on Online Mind2Web. My bet is that browser agents need a protocol designed for models, not just wrappers around CDP.

bignoggins•7h ago

what do you do differently compared to other options?

theredsix•7h ago

The difference is that we make browser use turn-based and return a single structured result per action.

With most other tools, the model is interacting with a live browser and effectively has to reason through a stream of low-level events while the page keeps changing. We instead freeze the page, let the model request one action, execute it, allow all resulting browser events to play out, then freeze again and return one bundled response with everything that happened plus the new stable page state.

So the model isn’t chasing a moving UI or event stream. It gets one grounded step at a time. A big part of the performance gain seems to come from that holistic action envelope.
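A minimal sketch of the turn-based loop described above, using a toy in-memory page rather than the project's real API. `ToyPage`, `StepResult`, `step`, `run`, and `policy` are all hypothetical names invented for illustration; the only point is that queued page work runs inside a step, never between steps, and that each action returns one bundled result.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StepResult:
    """One bundled response per action (hypothetical shape)."""
    action: str
    events: List[str]
    state: dict


@dataclass
class ToyPage:
    """Stand-in for a page whose queued callbacks only run during a step."""
    state: dict = field(default_factory=lambda: {"cart": 0, "popup": False})
    pending: List[Callable[["ToyPage"], str]] = field(default_factory=list)

    def dispatch(self, action: str) -> List[str]:
        events = [f"dispatched:{action}"]
        if action == "click:add_to_cart":
            self.state["cart"] += 1
            # The click schedules follow-up work, like a popup timer.
            self.pending.append(_show_popup)
        elif action == "click:close_popup":
            self.state["popup"] = False
        return events

    def drain(self) -> List[str]:
        """Run every queued callback until the page is quiet."""
        events = []
        while self.pending:
            callback = self.pending.pop(0)
            events.append(callback(self))
        return events


def _show_popup(page: ToyPage) -> str:
    page.state["popup"] = True
    return "popup:shown"


def step(page: ToyPage, action: str) -> StepResult:
    """Apply one action, let resulting events settle, then report once."""
    events = page.dispatch(action)
    events += page.drain()
    # Nothing runs after this point until the next step: the page is "frozen".
    return StepResult(action=action, events=events, state=dict(page.state))


def policy(state: dict) -> Optional[str]:
    """Toy stand-in for the model: pick the next action from a stable state."""
    if state["popup"]:
        return "click:close_popup"
    if state["cart"] == 0:
        return "click:add_to_cart"
    return None


def run(page: ToyPage, choose: Callable[[dict], Optional[str]], max_steps: int = 5):
    history = []
    for _ in range(max_steps):
        action = choose(page.state)
        if action is None:
            break
        history.append(step(page, action))
    return history
```

Calling `run(ToyPage(), policy)` takes two steps: the first result carries both the click and the popup it triggered, so the toy policy sees the popup in a stable state and closes it on the next turn.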

shane-moran•7h ago

This is a great example of breaking from the assumption that model improvement is always what's needed. Sometimes the model just doesn't have a good way to interface with the data or system.

The clean part of this approach is that it improves the model's ability to interact with websites and the computer as they are, without having to redefine the interface system or develop an entirely new interface protocol at the machine level.

theredsix•7h ago

Exactly, the harness or protocol can matter just as much!

Thors3n•5h ago

Very exciting stuff! Most agent browser stacks still feel clunky to me. This is very promising: turning browsing into deterministic, atomic steps should definitely improve user interaction and E2E utility.

greggberry•5h ago

This project is incredible!

I already have it set up for local Claude agent use and am seeing significant improvement in both accuracy and task efficiency: `claude mcp add browser -- npx -y agent-browser-protocol@rc --mcp`

Additionally, if you want to configure with Claude Desktop, add the following to your `claude_desktop_config.json` after installing the MCP:

```
"mcpServers": {
  "browser": {
    "command": "npx",
    "args": [
      "-y",
      "agent-browser-protocol@rc",
      "--mcp"
    ]
  }
}
```
