I've been curious whether this principle generalizes to today's agents.
So mehulkalia and I built Browser Brawl at the YC / BrowserUse hackathon last weekend and won first place. It is a fun experiment in which an attacker agent tries to complete tasks on live websites while a defender agent injects JavaScript to sabotage it.
The analogy isn't perfect, because browser tasks aren't zero-sum. But our hypothesis is that an agent faced with an adversary should produce more interesting training data than one navigating clean, static environments.
Try it: http://browser-brawl.com
GitHub: https://github.com/RichardHruby/browser-brawl
Demo Video: https://youtu.be/NIoFXv-JvBY
(Skip to [0:55](https://www.youtube.com/watch?v=NIoFXv-JvBY&t=55s) to see the agents “brawling” in the arena :) and to [1:52](https://www.youtube.com/watch?v=NIoFXv-JvBY&t=1m52s) to see the generated browser traces.)
Would love to chat with anyone building or training browser agents. Happy to dive in below!
SobjectiveTruth•1h ago