Prove You Are a Robot: CAPTCHAs for Agents

https://browser-use.com/posts/prove-you-are-a-robot

31•lukasec•4d ago

Comments

AgentNews•3d ago

Pure genius! I had my agent hit the endpoint and realized it returned a jumble of text: "if 七 wor~kers co.mplet/e{ | a job in 十七} days but 四 ] quit a^ft|e?r ^ day_ 三 ~ how many to{tal da[y;s> to fin>i?sh" but it was in Japanese! Unfortunately my agent proceeded to solve the reverse CAPTCHA and got back the API key. So I asked it to keep hitting the endpoint until it returned another CAPTCHA in Japanese kanji, and it did (without solving it this time). I got "a s:tore h?as ^ 二十 pe@rcent off< items- over 五十 : dollar;s and 八 ~ percent } of\f> ; i]te[ms u~nd~er: # 五十 do/ll@ars wh-ats } the c.omb>ined pri|c;e of a 一百二十一 dollar item a]nd> a* 九 dollar} i!tem" This time I was able to translate that into "a store has 20 percent off items over 50 dollars and 8 percent off items under 50 dollars what's the combined price of a 121 dollar item and a 9 dollar item?" I solved it and got 121 × 0.8 + 9 × 0.92 = 105.08. I will admit I messed up a little on translating the kanji and got some assistance from my agent pointing out where I was wrong, but overall this was good fun, well done!
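For anyone who wants to check the decoding and the arithmetic above, here is a minimal sketch in Python. It is not how the site or any agent actually solves these; the function names are made up for illustration, the numeral parser only covers values below 1000, and it assumes "over 50" means strictly greater than 50 (the challenge does not say what happens at exactly 50). Prices are kept in integer cents to avoid floating-point rounding.

```python
import re

# Hypothetical helpers for illustration; not part of any real API.
DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "百": 100}
NUMERAL_RUN = re.compile("[零一二三四五六七八九十百]+")

CHALLENGE = (
    r"a s:tore h?as ^ 二十 pe@rcent off< items- over 五十 : dollar;s and 八 ~ "
    r"percent } of\f> ; i]te[ms u~nd~er: # 五十 do/ll@ars wh-ats } the "
    r"c.omb>ined pri|c;e of a 一百二十一 dollar item a]nd> a* 九 dollar} i!tem"
)


def cjk_to_int(numeral: str) -> int:
    """Convert a Chinese/kanji numeral below 1000 (e.g. 一百二十一) to an int."""
    total, current = 0, 0
    for ch in numeral:
        if ch in DIGITS:
            current = DIGITS[ch]
        elif ch in UNITS:
            # A bare unit such as 十 in 十七 means 1 * 10.
            total += (current or 1) * UNITS[ch]
            current = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + current


def clean_challenge(text: str) -> str:
    """Replace numerals with digits, then drop noise punctuation."""
    with_digits = NUMERAL_RUN.sub(lambda m: str(cjk_to_int(m.group())), text)
    letters_only = re.sub(r"[^0-9A-Za-z\s]", "", with_digits)
    return " ".join(letters_only.split())


def discounted_cents(price: int, threshold: int, pct_over: int, pct_under: int) -> int:
    """Discounted price in cents; assumes 'over' means strictly greater."""
    pct = pct_over if price > threshold else pct_under
    return price * (100 - pct)  # dollars * percent remaining = cents


print(clean_challenge(CHALLENGE))
total = discounted_cents(121, 50, 20, 8) + discounted_cents(9, 50, 20, 8)
print(f"{total // 100}.{total % 100:02d}")  # 105.08
```

The 121 dollar item gets 20 percent off (96.80) and the 9 dollar item gets 8 percent off (8.28), which matches the 105.08 in the comment.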

pxc•1h ago

Absent any distinctive Japanese scripts or other Japanese writing in context, it probably makes more sense to call those Chinese characters, since those characters for numbers were taken directly from Chinese and still retain the same/original meanings in both languages.

singpolyma3•1h ago

...why? Once my agent has a key I, the human, can also use it. And surely any human use would be less intensive than any agent use.

tony_landis•1h ago

Right - perhaps the title could be "prove you are a robot, or have access to one"

jstanley•1h ago

But once a human has a key his agent could use that and people still like to use ordinary CAPTCHAs.

consumer451•1h ago

Exactly. I still believe that inverse CAPTCHAs are impossible, for any practical application.

Is this just a marketing stunt?

kingstnap•35m ago

To be fair, what's the practical application supposed to be for proving a user is a bot?

Silly solutions for silly problems :^).

consumer451•29m ago

Well, when the Moltbook story was everywhere, people later thought it was some big gotcha that "oh, they were actually humans."

So, showing true agent to agent interactions is interesting, but one could never be sure that's what you were actually seeing unless you were in control of all the agents.

stavros•46m ago

Because now you know their company exists!

echelon•1h ago

Speaking of browser automation, are there any LLMs or tools that hook up to actual desktop browsers and can automate the keyboard and mouse?

Which LLMs best drive these? Claude/Gemini, etc., or is anything local actually competent at it?

Can they understand layout and visual cues with a VLM or multimodality?

Are they robust enough to interact with threejs and videos and whatnot, or can they just blindly navigate the DOM?

loloquwowndueo•1h ago

> TL;DR: just ask your agent to summarize this post for you.

Holy shit - why don’t they produce an AI summary and plonk it in there for everyone to use? The energy savings across all people who’ll read the summary would be staggering!

Zetaphor•1h ago

Get the API key, hit the claim link, sign up for a new account, verify my email, go to the homepage:

Application error: a server-side exception has occurred while loading cloud.browser-use.com

Great first impression!

throw1234567891•1h ago

Maybe they know you’re not an agent.

bdangubic•1h ago

“It is not you, it’s me” should do it

arjie•1h ago

Very clever and fun. Two tangential observations: the bird between two trains problem I remember from childhood when we were studying for an Indian entrance exam. I thought it was in I E Irodov's problem anthology, but I cannot find it there so this must be a false memory. Looks like it's from ancient times, practically Mathematics mythology. Does anyone know the earliest books that have it? No luck with LLMs since it's such a common question today the answers I get from GPT-5.4 and Claude 4.6 Opus with search are unhelpful.

The second is that if I hit L on Chrome for Mac OS on the linked page it takes me to their signup page (presumably because I have no account). So that's a keyboard shortcut to take you to the browser-use app page. But why 'L'? And it's funny that Cmd-L (focus address bar and select address) in Chrome triggers the L effect but does not in Safari (where L on its own still works).

efebarlas•32m ago

Is it even possible to have an inverse captcha without time bounds?

Humans can use agents behind the scenes to crack it, right?

Retr0id•27m ago

A small detail about humans that breaks this whole scheme is that they're capable of tool use.

0xOsprey•13m ago

I aggregated a list of "reverse CAPTCHAs" here for anyone interested: https://x.com/0x_Osprey/status/2043020254289248469

