This appears to be a proof-of-work scheme, like Anubis. Real captchas collect much more fingerprinting data to ensure that only users with the latest version of Chrome, the latest version of Windows, and an Nvidia graphics card can use the site.
On topic though, how does this improve on hCaptcha?
Cloud vs self-hosted, click annoying things challenge vs automatic proof of work. Or are there other hCaptcha versions and I just never realized it?
I've been bewildered for some time as well, honestly; it took me a while to figure out the first one I ran into.
And trying one now, fully knowing that I'd have to solve one, I was dumbfounded by the puzzle I got; it took me a few seconds to understand it.
Cloudflare's ones are horrible and a plague (although they might have slightly improved recently), but I'm not certain I'd prefer hCaptchas over them.
No idea how I’d compare to others on my network; that’d only be my wife, and as a Linux user she’d probably get more of them than I do on Windows ;)
Example: if the system identifies the client as a likely bot, it serves a more expensive PoW challenge, i.e. one that takes longer to solve.
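Something along those lines, presumably; here's a hypothetical sketch (the score and thresholds are made up, not ALTCHA's actual behavior):

    // Hypothetical: scale the expected PoW cost with how bot-like the client looks.
    // botScore is assumed to come from some upstream heuristic, 0 (human-like) to 1 (bot-like).
    function expectedHashes(botScore: number): number {
      if (botScore > 0.8) return 2_000_000; // suspected bot: seconds of hashing
      if (botScore > 0.5) return 200_000;   // unclear: noticeably slower
      return 20_000;                        // likely human: barely noticeable
    }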
I think somebody might have flagged your comment, but what you said is in fact true.
This is one of the reasons people say Cloudflare owns the majority of the internet, but I think I'm okay with that since Cloudflare is pretty chill and provides the best services. Still, it just shows that the internet isn't that decentralized.
But Google's captcha is literally tracking you, IIRC. I would personally prefer hCaptcha if you want a centralized solution, or Anubis if you want to self-host (I prefer Anubis, I guess).
Or sometimes everything that's not just Chromium[2].
[1] - https://www.theregister.com/2025/03/04/cloudflare_blocking_n...
[2] - https://www.techradar.com/pro/cloudflare-admits-security-too...
Downvotes. Comments with negative scores are shown with lower contrast. The more negative the score, the less contrast they get.
Why would someone renting dirt-cheap botnet time care if the requests to your site take a few seconds longer?
Plus, the requests are still getting through after waiting a few seconds, so it does nothing for the website operator and just burns battery for legit users.
Usually, if you're going to go through the trouble of integrating a captcha, you want to protect against targeted attacks, like a forum spammer, where you don't want to let the abusive requests through at all, not just let them through after 5000 ms.
Even if the bot owner doesn't watch (or care about) their crawling metrics, at least the botnet is not DDoSing the site in the meantime.
This is essentially a client-side tarpit, which is actually pretty effective against all forms of bot traffic while not impacting legitimate users very much, if at all.
This is something you throw everyone through: both your abusive clients (running on stolen or datacenter hardware) and your real clients (running on battery-powered laptops and phones). More like a tar-checkpoint.
So the crazy decentralized mystery botnet(s) that are affecting many of us don't seem to be that worried about cost. They are making millions of duplicate requests for duplicate useless content; it's pretty wild.
On the other hand, they ALSO don't seem to be running user agents that execute JavaScript.
This is among the findings of a group of my colleagues at peer non-profits who have been sharing notes to try to understand what's going on.
So the fact that they don't run JS at present means that PoW would stop them -- but so would something much simpler and cheaper relying on JS.
If this becomes popular, could they afford to run JS and calculate the PoW?
It's really unclear. The behavior of these things doesn't make enough sense to me to have much of a theory about their costs/benefits or budgets; it's all a mystery to me.
Definitely hoping someone manages to figure out who's really behind this, and why, at some point. (I am definitely not assuming it's a single entity, either.)
Basically you need session-token generators, which are usually automated headless browsers.
Another point that isn't exactly valid is the botnet assumption: you don't need one. You can scrape at scale with one machine using proxies. Proxies are dirt cheap.
So basically you generate a session for a proxy IP and scrape as long as the token is valid. No botnets, no magic, nada. Just business.
I might at any rate set my PoW to be relatively cheap, which would take care of anyone not executing JS.
There are two problems some website operators encounter:
A) How do I ensure no one DDoSes me (deliberately or inadvertently)?
B) How can I ensure this client is actually a human, not a robot?
Things like reCAPTCHA aimed to solve B, not A. But the submitted solution seems to be more for A, as a PoW can be (in fact, probably must be) calculated by a machine, not a human, while reCAPTCHA is supposed to be the opposite: solvable only by a human.
AI bots can't solve proof-of-work challenges because the browsers they use for scraping don't support the features needed to solve them. This is highlighted by the existence of other proof-of-work solutions designed specifically to filter out AI bots, like go-away[1] or Anubis[2].
And yes, they work: once GNOME deployed one of these proof-of-work challenges on their GitLab instance, traffic on it fell by 97%[3].
[1] - https://git.gammaspectra.live/git/go-away
[2] - https://github.com/TecharoHQ/anubis
[3] - https://thelibre.news/foss-infrastructure-is-under-attack-by...: "According to Bart Piotrowski, in around two hours and a half they received 81k total requests, and out of those only 3% passed Anubi's proof of work, hinting at 97% of the traffic being bots."
At least sometimes. I do not know about AI scraping, but there are plenty of scraping solutions that do run JS.
It also puts off some genuine users like me who prefer to keep JS off.
The 97% is only accurate if you assume a zero false positive rate.
Non-JavaScript challenges are also available[1].
> "The 97% is only accurate if you assume a zero false positive rate."
GNOME's GitLab instance is not something people visit daily like Wikipedia, so the number of false positives is negligible.
[1] - https://git.gammaspectra.live/git/go-away/wiki/Challenges#no...
Did not know that. Good news
> "GNOME's GitLab instance is not something people visit daily like Wikipedia, so the number of false positives is negligible."
As an absolute number, yes, but as a proportion?
Huh, they definitely can?
go-away and Anubis reduce the load on your servers, as bot operators cannot just scrape N pages per second without any drawbacks. Instead, it gets really expensive to make thousands of requests, as they're all really slow.
But for a user who uses their own AI agent to browse the web, things like Anubis and go-away aren't meant to (and don't) stop them from accessing websites at all; it'll just be a tiny bit slower.
Those tools are meant to stop site-wide scraping, not individual automatic user-agents.
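To put rough numbers on that (all figures below are illustrative assumptions, not measurements):

    // Back-of-the-envelope cost of a site-wide crawl behind a PoW challenge.
    const hashesPerSecond = 200_000; // assumed JS hash rate on a typical client
    const expectedHashes = 100_000;  // assumed average hashes needed per challenge
    const pages = 100_000;           // assume one fresh challenge per page crawled

    const secondsPerPage = expectedHashes / hashesPerSecond; // 0.5 s
    const totalCpuHours = (pages * secondsPerPage) / 3600;   // ~13.9 h
    console.log({ secondsPerPage, totalCpuHours });

Half a second once is nothing for a person reading a few pages; roughly fourteen CPU-hours for a full crawl is a real, if not prohibitive, cost for a scraper.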
Well, maybe. As far as I can see, the overt ones are using pretty reasonable rate limits, even though they're scraping in useless ways (every combination of git hash and file path on Gitea). Rather, it seems like the anonymous ones are the problem, and since they're anonymous, we have zero reason to believe they're AI companies. Some of them are running on Huawei Cloud. I doubt OpenAI is using Huawei Cloud.
By this point, it’s obvious that that has failed, and even that no general solution is possible any more.
ALTCHA… telling Computers and Humans Apart? No, this is proof of work, meaning it’s just about making things expensive—abuse control, not actually distinguishing between computers and humans.
In fact, in https://altcha.org/captcha/ one of the headings is Inclusive to Robots! This is so far the opposite of traditional CAPTCHA, on the technical side, that it’s mildly hilarious. (Socially, they largely amount to the same thing—people never did actually care about computers, just abusive bots.)
Then the question is: what is the proof of work mechanism? How robust are things going to be, and can you ensure attacking will remain expensive, without burdening users too much?
https://altcha.org/docs/proof-of-work/ indicates it’s SHA hashing, not something like scrypt. Uh oh. The best specialised hardware is several million times as good as good laptops¹, let alone cheap phones. If this were to become popular, bots would switch to such hardware, probably making the cost of attacking practically negligible. https://altcha.org/docs/complexity/ shows they’ve thought about these things, but I feel that although it will work for a while, it’s ultimately a doomed game. And in the mean time, you can normally go waaaay simpler and less intrusive: most bots are extremely dumb.
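For reference, the scheme described there is roughly of this shape; a minimal sketch assuming the usual salt-plus-number construction (names and parameters here are illustrative, not ALTCHA's actual API):

    import { createHash, randomBytes, randomInt } from "node:crypto";

    const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

    // Server: pick a salt and a secret number, publish the salt plus hash(salt + number).
    function makeChallenge(maxNumber = 100_000) {
      const salt = randomBytes(12).toString("hex");
      const secret = randomInt(0, maxNumber);
      return { salt, maxNumber, challenge: sha256(salt + secret) };
    }

    // Client: brute-force the number. Expected cost: maxNumber / 2 hashes.
    function solve({ salt, maxNumber, challenge }: ReturnType<typeof makeChallenge>) {
      for (let n = 0; n <= maxNumber; n++) {
        if (sha256(salt + n) === challenge) return n;
      }
      return null;
    }

    console.log(solve(makeChallenge())); // finds the number after ~50k hashes on average

The point of the SHA-vs-scrypt complaint is that dedicated hardware runs exactly this loop millions of times faster than a browser can.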
Is “captcha” heading in the direction of meaning “bad rate limiting”?
Because really that’s what this stuff is: rate limiting that trusts that clients don’t have lots of compute power conveniently available, but will get vaporised by powerful and intentional adversaries.
—⁂—
¹ On the https://altcha.org/docs/complexity/ test, a comparatively ideal browser on my 5800HS laptop might reach 500,000 SHA-256 hashes per second at a cost of at least 25W. (Chromium gets half this with ~50% CPU usage; Firefox one tenth, altogether failing to load the cores for some reason.) The most energy-efficient commercial Bitcoin miners seem to be doing around 80 billion of these hashes per watt-second. That’s four million times as good. You cannot bridge such a divide.
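(Sanity-checking that ratio with the footnote's own rough numbers:)

    // Energy efficiency: browser JS on a laptop vs. a SHA-256 ASIC, per the estimates above.
    const laptopHashesPerJoule = 500_000 / 25; // 500k hashes/s at ~25 W ≈ 20,000 hashes/J
    const asicHashesPerJoule = 80e9;           // ~80 billion hashes per watt-second
    console.log(asicHashesPerJoule / laptopHashesPerJoule); // 4,000,000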
In fact, I used to fake my user agent all the time because Microsoft 365 is so broken. With the Firefox/Linux user agent a lot of features don't work; when it pretends to be MS Edge it works fine. Clearly trying to force people to use the 'invented here' browser :(
But since I was getting captchas, I moved to using it only for the MS365 sites and nowhere else. That seems to have reduced the captchas somewhat, especially the ones that never end (keep looping). But I still get a ton of "Your browser is suspicious, here's an extra check" nonsense, from Cloudflare in particular.
Lots of big businesses use reCAPTCHA, quite often unnecessarily. If I need to log in with 2FA to use a service, does it really need reCAPTCHA?
Similarly, Cloudflare sends you emails telling you how many bots and attacks it has stopped, but you do not know how many false positives there were.
I would guess that simple rate limiting would do the trick for the rest.
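For what it's worth, "simple rate limiting" really can be simple; a per-client token bucket is a few lines (a sketch with made-up limits, not any particular product's implementation):

    // Minimal in-memory token bucket keyed by client (e.g. IP). Limits are illustrative.
    type Bucket = { tokens: number; last: number };
    const buckets = new Map<string, Bucket>();
    const RATE = 5;   // tokens refilled per second (assumed)
    const BURST = 20; // bucket capacity (assumed)

    function allow(key: string, now = Date.now()): boolean {
      const b = buckets.get(key) ?? { tokens: BURST, last: now };
      b.tokens = Math.min(BURST, b.tokens + ((now - b.last) / 1000) * RATE);
      b.last = now;
      const ok = b.tokens >= 1;
      if (ok) b.tokens -= 1;
      buckets.set(key, b);
      return ok;
    }

    // allow("203.0.113.7") stays true until the burst is spent, then ~5 requests/second.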
As far as I can tell, most startups resolve their technical debt by failing, and the majority of the rest resolve their debt by being acquired by a company which replaces the original service entirely in 1-3 years because it's too hard to integrate as-is.
At least with captchas, it's somewhat understandable with the arms-race aspect. The third party does the work of engaging in the arms race, so you don't have to, but the tradeoff is what you describe.
Maybe it's only used on individual form submit (like the classic captcha use-case), and not on a page load, and it does have to be recalculated on every form submit?
> To prevent the vulnerability of “replay attacks,” where a client resubmits the same solution multiple times, the server should implement measures that invalidate previously solved challenges.
> The server should maintain a registry of solved challenges and reject any submissions that attempt to reuse a challenge that has already been successfully solved.
This doesn't seem very scalable? Or am I missing something?
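It can stay reasonably cheap if every challenge carries an expiry and the server only remembers solved challenges until they expire; a minimal sketch of that idea (an assumed design, not necessarily what ALTCHA's server libraries actually do):

    // Remember solved challenge IDs only until their expiry, so the registry stays bounded.
    const solved = new Map<string, number>(); // challengeId -> expiry timestamp (ms)

    function acceptSolution(challengeId: string, expiresAt: number, now = Date.now()): boolean {
      if (now > expiresAt) return false;         // the challenge itself has expired
      if (solved.has(challengeId)) return false; // replay: already redeemed once
      solved.set(challengeId, expiresAt);
      return true;
    }

    // Periodically drop entries that could no longer be replayed anyway.
    setInterval(() => {
      const now = Date.now();
      for (const [id, exp] of solved) if (exp < now) solved.delete(id);
    }, 60_000);

Memory is then bounded by however many challenges get solved within one expiry window.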
This is only trying to tell human browsers and bot browsers apart. Not even that; it seems all it does is slow all browsers down equally.
Like whether there's a checkbox you have to click, and whether it spins for a while when you click it. That's a CAPTCHA now. And working is when your butt is in the chair. And investing is when you give someone money and they promise to give more back later. And food is things that fit in your mouth and don't kill you. And free speech is when you get turned away at the border for disliking the president on social media. And top-of-the-line CPUs are ones that die within 24 months. Meanwhile the totalitarian dictatorship across the pond actually does all these things better somehow (except the politics). https://en.wikipedia.org/wiki/HyperNormalisation#Etymology