I realize Anubis was probably never tested on a true single-core machine. They are actually somewhat difficult to find these days outside of microcontrollers.
Also, they still might not have (though they've probably learned by now). In this article they imply that the number of cores in each type of CPU core (what they call a "tier" in the article) will still be a power of two, and one of them just happened to be 2^0. I'm not sure they were around when the AMD Athlon II X3 was hot.
>>> Today I learned this was possible.

This was a total "today I learned" moment for me too. I didn't actually think that hardware vendors shipped processors with an odd number of cores, but if you look at the core geometry of the Pixel 8 Pro, it has three tiers of processor cores. I guess every assumption developers have about CPU design is probably wrong.
Another joke from the same era: having a 2-core processor means you can now, e.g., watch a film at the same time. At the same time as what? At the same time as running Windows Vista!
2^0 = 1
So the logic might make sense in people's heads if they've never encountered the 6- or 12-core CPUs that are common these days.
I have Chrome on mobile configured such that JS and cookies are disabled by default, and I then enable them per site based on my judgement. You might be surprised to learn that this usually works fine, and sites are often better for it: they stop nagging, and they load faster. This makes some sense in retrospect, as this is what lets search engine crawlers do their thing and keep that SEO score going.
Anubis (and Cloudflare, for that matter) forces me to temporarily enable JS and cookies at least once anyway, completely defeating the purpose of my paranoid settings. I basically never bother to, but I do admit it's annoying. It's up there with sites that have no content at all unless JS is on (high-profile example: AWS docs). At least Cloudflare only spoils the fun every now and then; with Anubis, it's always.
It's definitely my fault, but at the same time, I don't feel this is right. Simple static pages now require allowing arbitrary code execution and statefulness. (Although I do recognize that SVGs and fonts also kind of do so anyhow, much to my further annoyance).
Making you pay time, power, bandwidth, or money to access content does not significantly impede your browsing, so long as the cost is appropriately small. For the user above reporting thirty seconds of maxed-out CPU, that's excessive for the median person (though we hackers are not that).
If giving up your unique burned-in crypto-attested device ID is acceptable, there's an entire standard for that, and when your device is found to misbehave, it can be banned. Nintendo, Sony, and Xbox call this a "console ban"; it's quite effective because eating one is stunningly expensive.
If submitting proof of citizenship through some anonymous-attestation protocol is palatable, then Anubis could simply add the digital ID web standard and let users skip the proof of work in exchange for affirming that they have a valid digital ID. But this only works if the identity can be banned; otherwise AI crawlers will just send a valid anonymized digital ID header.
This problem repeats in every suggested solution: either you make it more difficult for users to access a site, or you require users to waste energy to access a site, or you require identifiable information signed by a trusted third-party authority to be presented, such that a ban based on it is possible. IP addresses don't satisfy this; Apple IDs, immutable HSM-protected device identifiers, and digital passports do.
If you have a solution that only presents barriers to excessive use and allows abusive traffic to be revoked without depending on IP address, browser fingerprint, or paid/state credentials, then you can make billions of dollars in twelve months.
Ideas welcome! This has been a problem since bots started scraping RSS feeds and republishing them as SEO blogs, and we still don’t have a solution besides Cloudflare and/or CPU-burning interstitials.
I'm not sure what generation it is, but I bought it around a decade ago I think.
Javascripters, perhaps. Those who work on schedulers, or kernels in general, would find this completely normal.
ranger_danger•1h ago
Why?
What would the alternative have been?
tux3•1h ago
The first effect is great, because it's a lot more annoying to bring up a full browser environment in your scraper than just run a curl command.
But the actual proof of work only takes about 10ms on a server in native code, while it can take multiple seconds on a low-end phone. Given that the companies in question are building entire data centers to house all their GPUs, an extra 10ms per web page is not a problem for them. They're going to spend orders of magnitude more compute training on the content they scraped than on solving the challenge.
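To make that asymmetry concrete, here's a minimal sketch of the kind of hash-prefix proof of work these challenges boil down to (not Anubis's actual code; the challenge string and the 16-bit difficulty are made-up stand-ins). Expected work is only ~2^16 hashes, which native code on a server burns through in milliseconds, while a low-end phone grinding it through a JS engine takes far longer:

    import hashlib
    import itertools

    def solve(challenge: str, difficulty_bits: int = 16) -> int:
        """Find a nonce so that sha256(challenge + nonce) starts with
        `difficulty_bits` zero bits. Illustrative sketch only."""
        target = 1 << (256 - difficulty_bits)  # hashes below this value pass
        for nonce in itertools.count():
            digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
            if int.from_bytes(digest, "big") < target:
                return nonce

    # Expected ~2**difficulty_bits attempts: trivial for a data center,
    # noticeable on a budget phone's JS engine.
    print(solve("made-up-challenge-string"))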
It was mostly the inconvenience of adapting to Anubis's JS requirements that held them back for a while; the PoW difficulty itself mainly slowed down real users.
alright2565•1h ago
MBCook•1h ago
https://github.com/TecharoHQ/anubis/pull/1038
Could someone explain how this would help stop scrapers? If you're just running the page JS, wouldn't this run too and let you through?
fluoridation•1h ago
MBCook•40m ago
fluoridation•30m ago
ranger_danger•1h ago
> how this would help stop scrapers
I think Anubis is based on some flawed assumptions:
- that most scrapers aren't headless browsers
- that they don't have access to millions of different IPs across the world from big/shady proxy companies
- that this can help with a real network-level DDoS
- that scrapers will give up if the requests become 'too expensive'
- that they aren't contributing to warming the planet
I'm sure some older bots exist that are not smart and don't use headless browsers, but especially with newer tech, AI crawlers, etc., I don't think this is a realistic majority assumption anymore.
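For what it's worth, running the page JS from a scraper is only a few lines with a headless browser. A hedged sketch (assuming Playwright for Python is installed; the URL and wait condition are placeholders, and whether a given interstitial actually clears depends on its specific checks):

    # Sketch of a scraper that simply executes the page's JS, challenge and all.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://example.org/some-protected-page")  # placeholder URL
        # The interstitial's JS runs as it would for a real visitor; wait for
        # the network to settle, then grab whatever content is now rendered.
        page.wait_for_load_state("networkidle")
        html = page.content()
        browser.close()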
zetanor•1h ago
jsnell•1h ago
In an adversarial engineering domain, neither the problems nor the solutions are static. If by some miracle you have a perfect solution at one point in time, the adversaries will quickly adapt, and your solution stops being perfect.
So you’ll mostly be playing the game in this shifting gray area of maybe legit, maybe abusive cases. Since you can’t perfectly classify them (if you could, they wouldn’t be in the gray area), the options are basically to either block all of them, allow all of them, or issue them a challenge that the user must pass to be allowed. The first two options tend to be unacceptable in the gray area, so issuing a challenge that the client must pass is usually the preferred option.
A good counter-abuse challenge is something that has at least one of the following properties:
1. It costs more to pass than the economic value that the adversary can extract from the service, but not so much that the legitimate users won’t be willing to pay it.
2. It proves control of a scarce resource without necessarily having to spend that resource, but at least in such a way that the same scarce resource can’t be used to pass unlimited challenges.
3. It produces additional signals that can be used to meaningfully improve the precision/recall tradeoff.
And proof of work does none of those. The last two fail by construction, since compute is about the most fungible resource in the world. The first doesn't work because it's impossible to balance the difficulty factor such that it imposes a cost the attacker would notice while still being acceptable to legitimate users.
If you add 10s of latency for your worst-case real users (already too long), it'll cost the attacker about $0.01/1k solves. That's not a deterrent to any kind of abuse.
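A back-of-envelope sketch with my own assumed inputs (the ~10 ms native solve time quoted upthread and a rough vCPU rental price); depending on whether the attacker solves in native code or a full headless browser, and on compute prices, the figure moves by an order of magnitude or two, but it stays down in the noise either way:

    # All inputs are assumptions, not measured values.
    native_solve_seconds = 0.010   # ~10 ms per solve in native code (upthread)
    vcpu_dollars_per_hour = 0.04   # rough rented-compute price

    cost_per_solve = native_solve_seconds / 3600 * vcpu_dollars_per_hour
    print(f"~${1000 * cost_per_solve:.4f} per 1k solves")  # fractions of a cent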
So proof of work is just a really bad fit for this specific use case. Its only advantage is that it's easy to implement, but that's a very short-term benefit.