Detect and crash Chromium bots

https://blog.castle.io/detect-and-crash-chromium-bots-with-one-weird-trick-bots-hate-it/
61•avastel•2d ago

Comments

lifthrasiir•3h ago
Previously on HN: Detecting Noise in Canvas Fingerprinting https://news.ycombinator.com/item?id=43170079

The reception was not really positive at the time, for the obvious reason.

chrismorgan•3h ago
Checking https://issues.chromium.org/issues/340836884, I’m mildly surprised to find the report just under a year old, with no attention at all (bar a me-too comment after four months), despite having been filed with priority P1, which I understand is supposed to mean “aim to fix it within 30 days”. If it continues to get no attention, I’m curious if it’ll get bumped automatically in five days’ time when it hits one year, given that they do something like that with P2 and P3 bugs, shifting status to Available or something, can’t quite remember.

I say only “mildly”, because my experience on Chromium bugs (ones I’ve filed myself, or ones I’ve encountered that others have filed) has never been very good. I’ve found Firefox much better about fixing bugs.

oefrha•2h ago
> The call to page.evaluate just hangs, and the browser dies silently. browser.close() is never reached, which can cause memory leaks over time.

Not just memory leaks. Since a couple months ago, if you use Chrome via playwright etc. on macOS, it will deposit a copy of Chrome (more than 1GB) into /private/var/folders/kd/<...>/X/com.google.Chrome.code_sign_clone/, and if you exit without a clean browser.close(), the copy of Chrome will remain there. I noticed after it ate up ~50GB in two days. No idea what's the point of this code sign clone thing, but I had to add --disable-features=MacAppCodeSignClone to all my invocations to prevent it, which is super annoying.
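
A minimal sketch of one way to make sure cleanup still runs when a call like page.evaluate hangs: bound the work with a timeout and attempt the close in a finally block. It uses only Python's asyncio. `FakeBrowser`, `hanging_task`, `run_with_cleanup`, and `EXTRA_CHROME_ARGS` are hypothetical names for illustration and are not part of Playwright or Puppeteer. With a real automation library, the flag quoted above would go into the browser launch arguments.

```python
import asyncio

# Flag quoted in the comment above. How it is passed depends on the
# automation library; this constant is only a reminder (assumption).
EXTRA_CHROME_ARGS = ["--disable-features=MacAppCodeSignClone"]


class FakeBrowser:
    """Hypothetical stand-in for a real browser handle."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


async def hanging_task(browser):
    """Simulates a call that never returns, like the hang described above."""
    await asyncio.sleep(3600)


async def run_with_cleanup(browser, task, timeout_s=30.0, close_timeout_s=5.0):
    """Run task(browser) with a deadline and always attempt browser.close()."""
    try:
        return await asyncio.wait_for(task(browser), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # the task hung; fall through to cleanup
    finally:
        try:
            await asyncio.wait_for(browser.close(), timeout=close_timeout_s)
        except asyncio.TimeoutError:
            pass  # close also hung; the caller may need to kill the process
```

If the close itself times out, the browser process is probably already gone or wedged, and killing it at the OS level is left to the caller.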

closewith•1h ago
That's an open bug at the minute, but the one saving grace is that they're APFS clones so don't actually consume disk space.

oefrha•38m ago
Interesting, IIRC I did free up quite a bit of disk space when I removed all the clones, but I also deleted a lot of other stuff that time so I could be mistaken. du(1) being unaware of APFS clones makes it hard to tell.
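
Since du(1) is said above to be unreliable here, one rough way to check whether removing the clones frees anything is to compare the volume's reported free space before and after the deletion. The sketch below uses only Python's standard library; `free_bytes` and `freed_by` are hypothetical helper names, and it assumes the filesystem reports the change promptly.

```python
import shutil


def free_bytes(path="/"):
    """Free bytes on the volume containing path, as reported by the OS."""
    return shutil.disk_usage(path).free


def freed_by(action, path="/"):
    """Run action() and return how many bytes of free space it released.

    Other processes writing to the same volume add noise, so treat the
    result as a rough estimate.
    """
    before = free_bytes(path)
    action()
    return free_bytes(path) - before
```

Here `action` would be whatever removes the clone directories. A result near zero would be consistent with the clones not occupying real space, though background disk activity can blur the number.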

omneity•2h ago
Relevant plug: At Herd we offer a browser automation and orchestration framework that uses real browsers and thus sidesteps several of these issues[0]. The API is puppeteer-like but doesn't use it as we built the entire framework[1] from scratch.

If you're wondering about the emphasis on MCPs, Herd is a generalist automation framework with a bespoke package format, trails[2], that supports MCP and REST out of the box.

0: https://herd.garden

1: https://herd.garden/docs/reference

2: https://herd.garden/docs/trails-automations

---

EDIT: I understand not everyone likes a shameless plug in another thread. The intention behind it, however, is also informative, as not every browser automation strategy is subject to the issues described in TFA.

The title does say crashing Chromium bots, yet our approach creates "Chromium bots" that do not crash under this premise, providing a useful counter-example.

randunel•1h ago
How do you deal with the usual CF, Akamai, and other services fingerprinting and blocking you? Or is that the customer's job to figure out?

omneity•1h ago
Thank you for the question! It depends on the scale you're operating at.

1. For individual use (or company use where each user is on their own device), the traffic is typically drowned out by regular user activity since we use the same browser, so no particular measure is needed; it just works. We have options for power users.

2. For large scale use, we offer tailored solutions depending on the anti-bot measures encountered. Part of it is to emulate #1.

3. We don't deal with "blackhat bots", so we don't offer support to work around legitimate anti-bot measures such as social spambots etc.

lyu07282•1h ago
If you don't put significant effort into it, any headless browser from cloud IP ranges will be banned by large parts of the internet. This isn't just about spam bots, you can't even read news articles in many cases. You will have some competition from residential proxies and other custom automation solutions that take care of all of that for their customers.

omneity•1h ago
Thanks, that's so true! We learned this the hard way building Monitoro[0] and large data scraping pipelines in the past, so we had the opportunity to build up the required muscle.

One thing to note: there are different "tiers" of websites, each requiring different countermeasures. Not everyone is pursuing the high-competition websites and, most importantly, as we learned, in several cases scraping is fully consensual or within the rights of the user. For example:

* Many of our users scrape their own websites to send notifications to their Discord community. It's a super easy way to create alerts without code.

* Sometimes users are locked into their own providers. For example, some companies have years of job posting information in their ATS that they cannot get out. We do help with that.

* Public data websites that are underutilized precisely because the data is difficult to access. We help make that data operational and actionable. For example, we had a sailor set up alerts on buoys to stay safe in high waters. A random example: [1]

0: https://monitoro.co

1: https://wavenet.cefas.co.uk/details/312/EXT
