
Why are anime catgirls blocking my access to the Linux kernel?

https://lock.cmpxchg8b.com/anubis.html
33•taviso•2h ago

Comments

PaulHoule•2h ago
I think a lot of it is performative and a demonstration that somebody is a member of a tribe, particularly the part about the kemonomimi [1] (e.g. people who are kinda like furries but have better taste in art)

[1] https://safebooru.donmai.us/posts?tags=animal_ears

dathinab•1h ago
you're overthinking it

it's as simple as: having a nice picture there makes the whole thing feel nicer and gives it a bit of personality

so you put in some picture/art you like

that's it

similarly, any site using it can change that picture, but there isn't any fundamental problem with the picture, so most don't care to change it

lxgr•1h ago
> This isn’t perfect of course, we can debate the accessibility tradeoffs and weaknesses, but conceptually the idea makes some sense.

It was arguably never a great idea to begin with, and stopped making sense entirely with the advent of generative AI.

yuumei•1h ago
> The CAPTCHA forces visitors to solve a problem designed to be very difficult for computers but trivial for humans.

> Anubis – confusingly – inverts this idea.

Not really, AI easily automates traditional captchas now. At least this one does not need extensions to bypass.

Philpax•1h ago
The argument isn't that it's difficult for them to circumvent - it's not - but that it adds enough friction to force them to rethink how they're scraping at scale and/or self-throttle.

I personally don't care about the act of scraping itself, but the volume of scraping traffic has forced administrators' hands here. I suspect we'd be seeing far fewer deployments if the scrapers behaved themselves to begin with.

davidclark•1h ago
The OP author shows that the cost to scrape an Anubis site is essentially zero since it is a fairly simple PoW algorithm that the scraper can easily solve. It adds basically no compute time or cost for a crawler run out of a data center. How does that force rethinking?
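
For concreteness, a minimal sketch of the kind of solver being described here, assuming the scheme the article lays out (find a nonce such that SHA-256(challenge + nonce) has the required number of leading hex zeros). TypeScript for Node; the names are illustrative, not Anubis's actual API:

    import { createHash } from "node:crypto";

    function solve(challenge: string, difficulty: number): number {
      const prefix = "0".repeat(difficulty);
      // Expected to succeed after roughly 16^difficulty attempts.
      for (let nonce = 0; ; nonce++) {
        const digest = createHash("sha256")
          .update(challenge + nonce)
          .digest("hex");
        if (digest.startsWith(prefix)) return nonce;
      }
    }

    // At difficulty 4 this finishes in well under a second on a single core.
    console.log(solve("example-challenge", 4));
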
hooverd•1h ago
The problem with crawlers is that they're functionally indistinguishable from your average malware botnet in behavior. If you saw a bunch of traffic from residential IPs using the same token, that's a big tell.

Philpax•1h ago
The cookie will be invalidated if shared between IPs, and it's my understanding that most Anubis deployments are paired with per-IP rate limits, which should reduce overall volume by limiting how many independent requests can be made at any given time.

That being said, I agree with you that there are ways around this for a dedicated adversary, and that it's unlikely to be a long-term solution as-is. My hope is that the act of having to circumvent Anubis at scale will prompt some introspection (do you really need to be rescraping every website constantly?), but that's hopeful thinking.
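
Illustrative only: one way to bind a pass token to an IP is an HMAC over the client IP plus an expiry, roughly as sketched below. Anubis's actual implementation differs, and SECRET/issueToken/verifyToken are hypothetical names.

    import { createHmac } from "node:crypto";

    const SECRET = "server-side-secret"; // hypothetical key, never sent to the client

    function issueToken(ip: string, ttlMs: number): string {
      const expiry = Date.now() + ttlMs;
      const payload = `${ip}|${expiry}`;
      const mac = createHmac("sha256", SECRET).update(payload).digest("hex");
      return `${payload}|${mac}`;
    }

    function verifyToken(token: string, requestIp: string): boolean {
      const [ip, expiry, mac] = token.split("|");
      const expected = createHmac("sha256", SECRET)
        .update(`${ip}|${expiry}`)
        .digest("hex");
      // Reject tokens replayed from a different IP or after expiry.
      return mac === expected && ip === requestIp && Date.now() < Number(expiry);
    }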

anotherhue•1h ago
Surely the difficulty factor scales with the system load?
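
One hypothetical shape for that, where each extra hex digit of required zeros multiplies the expected client work by 16:

    // Hypothetical: scale proof-of-work difficulty with server load.
    function difficultyFor(currentLoad: number, capacity: number): number {
      const base = 4; // cheap when the server is idle
      const pressure = Math.min(currentLoad / capacity, 1);
      return base + Math.round(pressure * 3); // up to ~4096x more work under full load
    }
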
lousken•1h ago
aren't you happy? at least you get to see a catgirl

jimmaswell•1h ago
What exactly is so bad about AI crawlers compared to Google or Bing? Is there more volume or is it just "I don't like AI"?

Philpax•1h ago
Volume, primarily - the scrapers are running full-tilt, which many dynamic websites aren't designed to handle: https://pod.geraspora.de/posts/17342163

jayrwren•1h ago
literally the top link when I search for his exact text "why are anime catgirls blocking my access to the Linux kernel?": https://lock.cmpxchg8b.com/anubis.html Maybe Travis needs more google-fu. Maybe that includes using DuckDuckGo?

ksymph•1h ago
This is neither here nor there, but the character isn't a cat. It's in the name, Anubis, who is an Egyptian deity typically depicted as a jackal or generic canine, and the gatekeeper of the afterlife who weighs the souls of the dead (hence the tagline). So more of a dog-girl, or jackal-girl if you want to be technical.

rnhmjoj•1h ago
I don't understand why people resort to this tool instead of simply blocking by UA string or IP address. Are there so many people running these AI crawlers?

I blackholed some IP blocks of OpenAI, Mistral and another handful of companies and 100% of this crap traffic to my webserver disappeared.

hooverd•1h ago
Less savory crawlers use residential proxies and are indistinguishable from malware traffic.

WesolyKubeczek•1h ago
You should read more. AI companies use residential proxies and mask their user agents with legitimate browser ones, so good luck blocking that.

rnhmjoj•58m ago
Which companies are we talking about here? In my case the traffic was similar to what was reported here [1]: crawlers from Google, OpenAI, Amazon, etc. They are really idiotic in behaviour, but at least they report themselves correctly.

[1]: https://pod.geraspora.de/posts/17342163

mnmalst•1h ago
Because that solution simply does not work for everyone. People tried, and the crawlers started using proxies with residential IPs.

WesolyKubeczek•1h ago
I disagree with the post author's premise that things like Anubis are easy to bypass if you craft your bot well enough and throw compute at it.

Thing is, the actual lived experience of webmasters is that the bots scraping the internet for LLMs are nothing like well-crafted software. They are more like your neighborhood shit-for-brains meth junkies competing over who can pull off more robberies in a day, no matter the profit.

Those bots are extremely stupid. They are worse than script kiddies' exploit-scanning software. They keep hammering the same pages with no regard for how often, if ever, those pages change. If they were a tenth as capable as many scraping companies' software, they wouldn't be a problem in the first place.

Since these bots are so dumb, anything that slows them down or stops them in their tracks is a good thing. Short of drone strikes on data centers, or accidents involving the owners of those companies that provide botware and residential-proxy networks to LLM companies, it seems fairly effective, doesn't it?

fluoridation•1h ago
Hmm... What if instead of using plain SHA-256 it was a dynamically tweaked hash function that forced the client to run it in JS?
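
Reading the suggestion as "randomize the hash construction per challenge, so one optimized native SHA-256 pipeline can't be reused", a hedged sketch; Tweak and tweakedHash are made-up names, and whether this actually forces JS execution is debatable:

    import { createHash } from "node:crypto";

    // Server-randomized parameters, generated fresh per challenge.
    interface Tweak { rounds: number; salt: string; reverse: boolean }

    function tweakedHash(input: string, t: Tweak): string {
      let data = t.salt + input;
      for (let i = 0; i < t.rounds; i++) {
        let digest = createHash("sha256").update(data).digest("hex");
        if (t.reverse) digest = digest.split("").reverse().join("");
        data = digest;
      }
      return data;
    }
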
VMG•1h ago
crawlers can run JS, and can also invest in running the Proof-of-JS better than you can

fluoridation•1h ago
If we're presupposing an adversary with infinite money, then there's no solution. One may as well just take the site offline. The point is to spend effort in such a way that the adversary has to spend much more effort, hopefully so much it's impractical.

tjhorner•27m ago
Anubis doesn't target crawlers which run JS (or those which use a headless browser, etc.). It's meant to block the low-effort crawlers that tend to make up large swaths of spam traffic. One can argue about the efficacy of this approach, but those higher-effort crawlers are out of scope for the project.

ksymph•1h ago
Reading the original release post for Anubis [0], it seems like it operates mainly on the assumption that AI scrapers have limited support for JS, particularly modern features. At its core it's security through obscurity; I suspect that as usage of Anubis grows, more scrapers will deliberately implement the features needed to bypass it.

That doesn't necessarily mean it's useless, but it also isn't really meant to block scrapers in the way TFA expects it to.

[0] https://xeiaso.net/blog/2025/anubis/

jhanschoo•1h ago
Your link explicitly says:

> It's a reverse proxy that requires browsers and bots to solve a proof-of-work challenge before they can access your site, just like Hashcash.

It's meant to rate-limit access by requiring client-side compute: light enough for legitimate human users and responsible crawlers, but taxing enough to impose a cost on indiscriminate crawlers that request host resources excessively.

It indeed mentions that lighter crawlers don't implement the functionality needed to execute the JS, but that's not the main reason it is thought to be sensible. The challenge says, in effect: you need to want the content badly enough to spend the kind of compute an individual typically has on hand before I'll do the work of serving it to you.

ksymph•10m ago
Here's a more relevant quote from the link:

> Anubis is a man-in-the-middle HTTP proxy that requires clients to either solve or have solved a proof-of-work challenge before they can access the site. This is a very simple way to block the most common AI scrapers because they are not able to execute JavaScript to solve the challenge. The scrapers that can execute JavaScript usually don't support the modern JavaScript features that Anubis requires. In case a scraper is dedicated enough to solve the challenge, Anubis lets them through because at that point they are functionally a browser.

As the article notes, the work required is negligible, and as the linked post notes, that's by design. Wasting scraper compute is part of the picture to be sure, but not really its primary utility.
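
A back-of-envelope check of "negligible", assuming difficulty 4 and a native hash rate that is an assumption, not a measurement:

    // Expected attempts for 4 leading hex zeros: 16^4 = 65,536 hashes.
    const expectedHashes = 16 ** 4;
    const hashesPerSecond = 5_000_000; // assumption: ~5 MH/s for native SHA-256 on one core
    console.log((expectedHashes / hashesPerSecond) * 1000); // ≈ 13 ms of CPU per challenge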

iefbr14•1h ago
I wouldn't be surprised if just delaying the server response by some 3 seconds had the same effect on those scrapers as Anubis claims to have.
xena•9m ago
This same author also ignored the security policy and dropped an Anubis double-spend attack on the issue tracker. Their email got eaten by my spam filter, so I didn't realize I'd been emailed at all.

Fun times.
