What if an AI could actually use the browser — not through brittle scripts, but by seeing the UI and deciding where to click?
In this video, I explain how modern agent tools work and demonstrate them live by letting a model *play Tic-Tac-Toe in the browser*. The agent takes screenshots, reasons about the UI, and interacts with the page step by step, which is very different from traditional WebDriver-style automation.
I use this demo to unpack:
- How AI-driven UI interaction works under the hood
- Why provider-native tools enable more reliable agent behavior
- How this differs from classic, deterministic browser automation
- Where this approach makes sense — and where it definitely doesn’t
If you’ve built browser automation, testing frameworks, or agentic systems, this should give you a concrete mental model for where things are heading.
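For anyone who wants the shape of the loop before watching: below is a minimal sketch of the screenshot → reason → act cycle, assuming Playwright drives the browser. `call_model` is a hypothetical stand-in for whatever provider-native computer-use tool handles the reasoning; it is not the code from the video.

```python
# Minimal sketch of a screenshot -> reason -> act agent loop (illustrative only).
from playwright.sync_api import sync_playwright


def call_model(screenshot_png: bytes) -> dict:
    """Hypothetical: send the screenshot to the model and get back an action,
    e.g. {"type": "click", "x": 412, "y": 300} or {"type": "done"}."""
    raise NotImplementedError


def run_agent(url: str, max_steps: int = 20) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto(url)
        for _ in range(max_steps):
            shot = page.screenshot()       # 1. see the UI
            action = call_model(shot)      # 2. let the model decide what to do
            if action["type"] == "done":   # 3. act, or stop when the model says so
                break
            if action["type"] == "click":
                page.mouse.click(action["x"], action["y"])
        browser.close()
```

The key difference from classic WebDriver scripts is that nothing here selects elements by CSS or XPath: the model looks at pixels and picks coordinates, so the loop survives UI changes that would break a hard-coded locator, at the cost of determinism.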