LLM scraper bots are overloading acme.com's HTTPS server

http://acme.com/updates/archive/229.html
26•mjyut•2h ago

Comments

davidsojevic•1h ago
I suspect part of the issue is that people are still using things like `acme.com` and `demo.com` as example domains in their documentation and tests instead of relying on `example.com`, which is reserved for exactly this purpose [0]

[0]: https://www.iana.org/domains/reserved
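For anyone who wants to check their own docs for this, here is a rough sketch in Python. It assumes the reserved names from RFC 2606 (`example.com`, `example.net`, `example.org`, and the `.test`, `.example`, `.invalid`, `.localhost` TLDs). The function names and the URL regex are made up for illustration and will miss hostnames that are not written as `http(s)://` URLs.

```python
import re

# Names reserved for documentation and testing (RFC 2606).
RESERVED_SECOND_LEVEL = {"example.com", "example.net", "example.org"}
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}

# Hypothetical, deliberately simple pattern: only catches http(s) URLs.
HOST_RE = re.compile(r"https?://([A-Za-z0-9.-]+)")


def is_reserved_for_docs(hostname: str) -> bool:
    """Return True if hostname falls under a reserved example name."""
    labels = hostname.lower().rstrip(".").split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    return ".".join(labels[-2:]) in RESERVED_SECOND_LEVEL


def find_unreserved_hosts(text: str) -> list[str]:
    """List hostnames used in URLs in text that are not reserved names."""
    seen: list[str] = []
    for host in HOST_RE.findall(text):
        if not is_reserved_for_docs(host) and host not in seen:
            seen.append(host)
    return seen
```

Running `find_unreserved_hosts` over a README or test fixture gives a list of real domains that would receive traffic if someone (or some bot) followed the sample URLs.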

Frieren•1h ago
> The LLM companies are not picking on me in particular, they are pounding every site on the net.

Why isn't this a criminal offense? They are hurting businesses for profit (or for a higher valuation, as they probably have no profit at all).

Why are corporations allowed to do with impunity what could land even a teenager years in prison? Is there no rule of law anymore?

The five-year and ten-year penalties kick in only when the government can show the offense caused at least $5,000 in losses across all victims during a one-year period. https://legalclarity.org/what-are-the-punishments-for-a-ddos...

tempest_•1h ago
Because might makes right and any entity with the power to legally put up a fight is in on the game (or wants to be)

heavyset_go•1h ago
We've already established that computer crime and IP laws apply to normies and not tech companies

budududuroiu•1h ago
Normative vs prerogative state [1]. See US v. Swartz compared to Meta use of LibGen for Llama

[1] https://en.wikipedia.org/wiki/Dual_state_(model)

dannyobrien•23m ago
So, I knew Aaron and I definitely would not presume to predict what he would have thought, but I’d point out there is a sizeable state space where he should never have been prosecuted, and scraping by others, including large commercial companies, should not be prosecutable on the same grounds.

I repeat what Aaron’s friends and lawyers said at the time: we were going to fight that case, and we were going to win.

avazhi•51m ago
Is what an offence lol? Bot scraper traffic?

How do you think search engines work?

will4274•43m ago
It's a bit more like a physical business with a "public welcome" policy like a coffee shop going viral and then having tens of thousands of people walking in and taking pictures but not buying coffee. It's disruptive, but not illegal.

Acme.com is welcome to require authentication for all pages but their home page, which would quickly cause the traffic to drop. They don't want to do this: like the coffee shop, they want to be open to the public, and for good reasons.

Sometimes the use profile changes dramatically in a short time. 15 years ago, Netflix created the video streaming market, and shared bandwidth capacity that had previously been more than enough no longer sufficed. 15 years before that, Google did the same thing when they created search and started driving tremendous traffic to text-based websites, which until then had spread through word of mouth.

Turns out the micro transaction people probably had the right idea.

legohead•35m ago
adapt or die

waiting on the govt to do something is a path of failure

reddozen•29m ago
Because the law deals with intent. The intent of a 12-year-old skiddie with a DDoS box is to harm someone else's internet. The intent of big scrapers is to collect data. If you want to make the latter illegal, then vote for that instead of loading it with the normative baggage of the former.

It's the same problem as why Occupy Wall Street fell apart: a bunch of losers who don't understand the system screech about the system. Because they don't understand it, they can't offer any meaningful dialogue about how to fix it beyond screeching.

JohnTHaller•1h ago
A series of Chinese LLM scrapers kept PortableApps.com running slow and occasionally unresponsive for 2 weeks.

superkuh•1h ago
There are plenty of local LLMs out there run by humans that play nice. It's not the LLMs that are the problem. It's the corporations. That's the commonality. Human people aren't doing this. These corporate legal persons are a much more dangerous and capable form of non-human intelligence with non-human motives than LLMs (which are not doing the scraping or even calling the tools which are sending the HTTP requests). And they have lobbied their way to legal immunity to most of their crimes.

happyopossum•59m ago
> Human people aren't doing this

Who do you think writes these scrapers? Well, I mean aside from the vibe coded ones.

chupchap•54m ago
Bot traffic is crazy even for smaller sites, but still manageable. I was getting 2,000 visitors a day on my infrequently updated website, but after I blocked all the bots via Cloudflare it went back to the normal double-digit visitor count.

avazhi•52m ago
> Someone really ought to do something about it.

What is bro proposing here?

kristianp•43m ago
> Nearly all of them were for non-existent pages.

Do any webservers have a feature where they keep a list in memory of files/paths that exist?
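To make the question concrete, here is a minimal Python sketch of the idea, not a description of any particular webserver's feature. It walks the document root once at startup and answers "does this path exist?" from an in-memory set. The names `build_path_index` and `status_for` are hypothetical, and it assumes a static site whose files do not change while the server runs.

```python
import os


def build_path_index(doc_root: str) -> frozenset[str]:
    """Walk doc_root once and return the URL paths that map to real files."""
    paths = set()
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, doc_root)
            # Normalise to forward slashes so the entry matches a URL path.
            paths.add("/" + rel.replace(os.sep, "/"))
    return frozenset(paths)


def status_for(path: str, index: frozenset[str]) -> int:
    """Return 200 for known paths and 404 otherwise, without touching the disk."""
    path = path.split("?", 1)[0]  # ignore any query string
    return 200 if path in index else 404
```

With something like this in front of the file handler, requests for non-existent pages cost one set lookup instead of a filesystem call. The trade-off is that the index has to be rebuilt whenever files are added or removed.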

arjie•21m ago
The only real solution is to put Anubis in front. For me, I just use Cloudflare in front and that suffices. But it's only a few thousand per hour by default. My homeserver can handle that quite well on its own.
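For context, tools in this category generally work by making the client do a small proof-of-work before it gets the page. The Python sketch below shows only that general idea, under the assumption of a SHA-256 hash with a leading-zero target. It is not Anubis's actual protocol, and the function names and `difficulty` parameter are made up for illustration.

```python
import hashlib
import itertools


def _digest(challenge: str, nonce: int) -> str:
    """Hex SHA-256 of the challenge string followed by the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: find the smallest nonce whose hash has `difficulty` leading hex zeros."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        if _digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest(challenge, nonce).startswith("0" * difficulty)
```

The asymmetry is the point: solving takes about 16^difficulty hashes on average, while verifying takes one. That is negligible for a single human visitor and adds up for a scraper fetching millions of pages.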

Project Glasswing: Securing critical software for the AI era

https://www.anthropic.com/glasswing
1074•Ryan5453•11h ago•487 comments

Lunar Flyby

https://www.nasa.gov/gallery/lunar-flyby/
527•kipi•14h ago•122 comments

Protect Your Shed

https://dylanbutler.dev/blog/protect-your-shed/
50•baely•2h ago•7 comments

Slightly safer vibecoding by adopting old hacker habits

http://addxorrol.blogspot.com/2026/03/slightly-safer-vibecoding-by-adopting.html
46•transpute•5d ago•21 comments

Native Americans had dice 12,000 years ago

https://www.nbcnews.com/science/science-news/native-americans-dice-games-probability-study-rcna26...
25•delichon•4d ago•4 comments

System Card: Claude Mythos Preview [pdf]

https://www-cdn.anthropic.com/53566bf5440a10affd749724787c8913a2ae0841.pdf
617•be7a•10h ago•443 comments

GLM-5.1: Towards Long-Horizon Tasks

https://z.ai/blog/glm-5.1
470•zixuanlimit•12h ago•188 comments

Binary obfuscation used in AAA Games

https://blog.farzon.org/2026/04/binary-obfuscation-that-doesnt-kill-lto.html
49•noztol•2d ago•10 comments

How to get better at guitar

https://www.jakeworth.com/posts/how-to-get-better-at-guitar/
248•jwworth•2d ago•121 comments

S3 Files

https://www.allthingsdistributed.com/2026/04/s3-files-and-the-changing-face-of-s3.html
251•werner•9h ago•70 comments

Cambodia unveils statue to honour famous landmine-sniffing rat

https://www.bbc.com/news/articles/c0rx7xzd10xo
330•speckx•11h ago•68 comments

A truck driver spent 20 years making a scale model of every building in NYC

https://www.smithsonianmag.com/smart-news/a-truck-drive-spent-20-years-making-this-astonishing-sc...
286•1659447091•1d ago•46 comments

Show HN: An interactive map of Tolkien's Middle-earth

https://middle-earth-interactive-map.web.app/
132•frasermarlow•8h ago•27 comments

A database of analog cameras that can be 3D printed

https://printed.analogcamera.space/
69•thomasjb•4d ago•7 comments

US and Iran agree to provisional ceasefire

https://www.theguardian.com/us-news/2026/apr/07/trump-iran-war-ceasefire
350•g-b-r•6h ago•942 comments

Show HN: Brutalist Concrete Laptop Stand (2024)

https://sam-burns.com/posts/concrete-laptop-stand/
725•sam-bee•18h ago•222 comments

The Clock

https://blog.senko.net/the-clock
38•senko•3d ago•6 comments

ACE on a USB→HDMI Adapter

https://blazelight.dev/blog/ms2160.mdx
4•theblazehen•3d ago•0 comments

Xilem – An experimental Rust native UI framework

https://github.com/linebender/xilem
51•Levitating•5h ago•14 comments

Cloudflare targets 2029 for full post-quantum security

https://blog.cloudflare.com/post-quantum-roadmap/
301•ilreb•15h ago•95 comments

A whole boss fight in 256 bytes

https://hellmood.111mb.de//A_whole_boss_fight_in_256_bytes.html
80•HellMood•2d ago•19 comments

JSIR: A High-Level IR for JavaScript

https://discourse.llvm.org/t/rfc-jsir-a-high-level-ir-for-javascript/90456
30•nnx•4h ago•7 comments

Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

https://github.com/mattmireles/gemma-tuner-multimodal
145•MediaSquirrel•9h ago•22 comments

Rescuing old printers with an in-browser Linux VM bridged to WebUSB over USB/IP

https://printervention.app/details
170•gmac•12h ago•77 comments

The Image Boards of Hayao Miyazaki

https://animationobsessive.substack.com/p/the-image-boards-of-hayao-miyazaki
136•vinhnx•1d ago•13 comments

Bitcoin and quantum computing

https://nehanarula.org/2026/04/03/bitcoin-and-quantum-computing.html
124•nehan•8h ago•88 comments

Google open-sources experimental agent orchestration testbed Scion

https://www.infoq.com/news/2026/04/google-agent-testbed-scion/
176•timbilt•15h ago•48 comments

Running out of disk space in production

https://alt-romes.github.io/posts/2026-04-01-running-out-of-disk-space-on-launch.html
175•romes•4d ago•89 comments

A blind man made it possible for others with low vision to build Lego sets

https://apnews.com/article/lego-bricks-for-blind-audio-braille-instructions-5a2a27de4354a0b144317...
66•speckx•14h ago•5 comments