At a high level, the agent generates and maintains userscripts and CSS that are re-applied on page load. Rather than just editing DOM via JS in console the agent is treating the page, and the DOM as a file.
The models are often trained in RL sandboxes with full access to the filesystem and bash, so they are really good at using it. So to make the agent behave well, I've simulated this environment.
The whole state of a page and scripts is implemented as a virtual filesystem hacked on top of browser.local storage. URL is mapped to directories, and the agent starts inside this directory. It has the tools to read/edit files, grep around and a fake bash command that is just used for running scripts and executing JS code.
I've tested only with Opus 4.5 so far, and it works pretty reliably. The state of the file system can be synced to the real filesystem, although because Firefox doesn't support Filesystem API, you need to manually import the fs contents first.
This agent is really useful for extracting things to CSV, but it's also can be used for fun.
Esophagus4•2h ago
That’s one of my biggest headaches writing user scripts currently: I write the script in an IDE with Claude then copy it to the browser / manually test it in the browser, then copy the results back to Claude or tell it what went wrong.
Looking forward to trying this.
mifydev•1h ago
Zekio•58m ago