Hi HN, I’m one of the founders at Browser Use.
This is the first benchmark we’re releasing that combines multiple public browser benchmarks into what we think is the closest approximation of what people actually want from browser agents.
It focuses on hard but solvable real-world tasks. We’ve run thousands of internal evaluations to validate the benchmark.
We plan to extend this soon with new benchmarks based on real tasks shared by power users. Feedback welcome.
MagMueller•1h ago