Is this just a marketing stunt?
Silly solutions for silly problems :^).
So, showing true agent-to-agent interactions is interesting, but you could never be sure that's what you were actually seeing unless you were in control of all the agents.
Which LLMs best drive these? Claude/Gemini, etc., or is anything local actually competent at it?
Can they understand layout and visual cues through a VLM or some other multimodal model?
Are they robust enough to interact with three.js scenes, videos, and whatnot, or can they only navigate the DOM blindly?
Holy shit - why don’t they produce an AI summary and plonk it in there for everyone to use? The energy savings across all people who’ll read the summary would be staggering!
Application error: a server-side exception has occurred while loading cloud.browser-use.com
Great first impression!
The second is that if I hit L in Chrome for macOS on the linked page, it takes me to their signup page (presumably because I have no account). So that's a keyboard shortcut that takes you to the browser-use app page. But why 'L'? And it's funny that Cmd-L (focus the address bar and select the address) triggers the L effect in Chrome but not in Safari (where L on its own still works).
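My guess at the Cmd-L part, without having looked at their code: the keydown handler probably checks only the key and never the modifier flags, so any keydown for "l" that reaches the page fires it, Cmd held or not. Below is a minimal sketch of the check that would avoid that. The names `KeyLike` and `isBareShortcut` are made up for illustration; the fields mirror the standard `KeyboardEvent` properties `key`, `metaKey`, `ctrlKey`, and `altKey`.

```typescript
// Hypothetical shape: the subset of KeyboardEvent fields the check needs.
interface KeyLike {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
  altKey: boolean;
}

// Returns true only for a bare shortcut key press (default "l"),
// i.e. with no Cmd/Ctrl/Alt held. Shift is deliberately allowed so "L" works.
function isBareShortcut(e: KeyLike, shortcut: string = "l"): boolean {
  if (e.metaKey || e.ctrlKey || e.altKey) return false;
  return e.key.toLowerCase() === shortcut;
}

// In a browser this could be wired up roughly like so (assumed, not their code):
// document.addEventListener("keydown", (e) => {
//   if (isBareShortcut(e)) { /* go to the app page */ }
// });
```

With a check like this, Cmd-L would fall through to the browser's own address-bar shortcut, and plain L would still trigger the page's shortcut.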
Humans can use agents behind the scenes to crack it, right?
AgentNews•3d ago
pxc•1h ago