
Show HN: Browser Harness – simplest way to give AI control of real browser

https://github.com/browser-use/browser-harness

5•gregpr07•2h ago

Hey HN,

We got tired of browser frameworks restricting the LLM, so we removed the framework and gave the LLM maximum freedom to do whatever it was trained to do. We also gave the harness the ability to self-correct and add new tools whenever the LLM wants them (that is, when it was pre-trained on them).

Our Browser Use library is tens of thousands of lines of deterministic heuristics wrapping Chrome (CDP websocket): element extractors, click helpers, target management (SUPER painful), watchdogs (crash handling, file downloads, alerts), cross-origin iframes (if you want to click on an element you have to switch the target first, very annoying), etc.

Watchdogs specifically are extremely painful but required. If Chrome triggers, for example, a native file popup, the agent is just completely stuck. So there are two solutions:
1. code those heuristics and edge cases away one by one and prevent them
2. give the LLM a tool to handle the edge case

As you can imagine, there are crazy amounts of heuristics like this, so you eventually end up with A LOT of tools if you go for #2. So you have to make compromises and just code those heuristics away.

BUT if the LLM just "knows" CDP well enough to switch targets when it encounters a cross-origin iframe, dismiss the alert when it appears, and write its own click helpers or upload function, you suddenly don't have to worry about any of those edge cases.
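To make "knowing CDP" concrete, here is a minimal sketch of the raw messages involved. It only builds the JSON text that would go over the websocket; the helper names (`cdp_command`, `dismiss_dialog`, `attach_to_iframe`) are made up for illustration and are not taken from the repo. The CDP methods themselves (`Page.handleJavaScriptDialog`, `Target.attachToTarget`) are real.

```python
import itertools
import json

# Every CDP command carries a unique integer id so replies can be matched to it.
_ids = itertools.count(1)


def cdp_command(method, params=None, session_id=None):
    """Serialize one CDP command as the JSON text sent over the websocket."""
    message = {"id": next(_ids), "method": method, "params": params or {}}
    if session_id is not None:
        # With flattened sessions, sessionId routes the command to an attached target.
        message["sessionId"] = session_id
    return json.dumps(message)


def dismiss_dialog(accept=False):
    # Page.handleJavaScriptDialog accepts or dismisses an open alert/confirm/prompt.
    return cdp_command("Page.handleJavaScriptDialog", {"accept": accept})


def attach_to_iframe(target_id):
    # Target.attachToTarget with flatten=True replies with a sessionId for that target,
    # which is then passed as session_id on later commands aimed at the iframe.
    return cdp_command("Target.attachToTarget", {"targetId": target_id, "flatten": True})
```

An LLM that already knows these method names can emit them directly instead of needing a dedicated tool per edge case.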

Turns out LLMs know CDP pretty well these days. So we bitter pilled the harness. The concepts that should survive are:
- something that holds the CDP websocket and keeps it alive (daemon)
- extremely basic tools (helpers.py)
- SKILL.md that explains how to use it

The new paradigm? SKILL.md + a few python helpers that need to have the ability to change on the fly.

One cool example: We forgot to implement an upload_file function. Then, mid-task, the agent wanted to upload a file, so it grepped helpers.py, saw nothing, and wrote the function itself using raw DOM.setFileInputFiles (which we only noticed later in a git diff). This was a really magical moment showing how powerful LLMs have become.
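For readers who haven't used that CDP method, a helper like this could look roughly as follows. This is a sketch, not the function the agent actually wrote: `send(method, params)` is a hypothetical callable that sends one CDP command and returns its result dict, and the selector-based lookup is an assumption.

```python
def upload_file(send, selector, paths):
    """Attach local files to an <input type="file"> matched by a CSS selector.

    `send(method, params)` is a hypothetical callable that sends one CDP
    command and returns its result dict; it is not part of any real library.
    """
    # DOM.getDocument returns the root node, whose nodeId anchors the query.
    root_id = send("DOM.getDocument", {"depth": 0})["root"]["nodeId"]
    found = send("DOM.querySelector", {"nodeId": root_id, "selector": selector})
    node_id = found["nodeId"]
    if not node_id:
        # DOM.querySelector reports nodeId 0 when nothing matches.
        raise LookupError(f"no element matches {selector!r}")
    # DOM.setFileInputFiles sets the files without opening the native file dialog.
    send("DOM.setFileInputFiles", {"files": list(paths), "nodeId": node_id})
    return node_id
```

Because the files are set directly on the input node, the native file popup that would otherwise leave the agent stuck never opens.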

Compared to other approaches (Playwright MCP, browser use CLI, agent-browser, chrome devtools MCP): all of them wrap Chrome in a set of predefined functions for the LLM. The worst failure mode is silent. The LLM's click() returns fine so the LLM thinks it clicked, but on this particular site nothing actually happened. It moves on with a broken model of the world. Browser Harness gives the LLM maximum freedom and perfect context for HOW the tools actually work.
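One way to picture the silent-failure problem: when the click is written out in raw CDP, it is easy to check whether anything observable changed. The sketch below is an illustration under assumptions, not how Browser Harness works internally: `send(method, params)` is again a hypothetical CDP sender, and the default `probe` expression is an arbitrary choice of "page state".

```python
def click_and_verify(send, x, y,
                     probe="location.href + '|' + document.body.innerHTML.length"):
    """Click at (x, y) and report whether a probe of the page state changed.

    `send(method, params)` is a hypothetical callable that sends one CDP
    command and returns its result dict. `probe` is a JavaScript expression
    whose value is compared before and after the click.
    """
    def snapshot():
        reply = send("Runtime.evaluate",
                     {"expression": probe, "returnByValue": True})
        return reply["result"].get("value")

    before = snapshot()
    # A click is a press followed by a release at the same coordinates.
    for event_type in ("mousePressed", "mouseReleased"):
        send("Input.dispatchMouseEvent",
             {"type": event_type, "x": x, "y": y,
              "button": "left", "clickCount": 1})
    # False means the click was dispatched but the probe saw no change.
    return snapshot() != before
```

A real version would likely need to wait for the page to react before taking the second snapshot; that is left out to keep the sketch short.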

Here are a few crazy examples of what browser harness can do:
- plays stockfish https://x.com/shawn_pana/status/2046457374467379347
- sets a world record in tetris https://x.com/shawn_pana/status/2047120626994012442
- figures out how to draw a heart with js https://x.com/mamagnus00/status/2046486159992480198?s=20

You can super easily install it by telling claude code: `Set up https://github.com/browser-use/browser-harness for me.`

Repo: https://github.com/browser-use/browser-harness

What would you call this new paradigm? A dialect?
