I also have a faceted search that some stupid crawler has spent the last month iterating through. Also mostly uncached URLs.
But from what I have read from time to time, these crawlers act orders of magnitude outside of what could be excused as merely being badly configured.
https://herman.bearblog.dev/the-great-scrape/
https://drewdevault.com/2025/03/17/2025-03-17-Stop-externali...
https://lwn.net/Articles/1008897/
https://tecnobits.com/en/AI-crawlers-on-Wikipedia-platform-d...
So why are they open to the entire world?
> Loading static pages from CDN to scrape training data takes such minimal amounts of resources that it's never going to be a significant part of my costs. Are there cases where this isn't true?
Why did you bring up static pages served by a CDN, the absolute best case scenario, as your reference for how crawler spam might affect server performance?
An example is NextJS where you're strongly encouraged[0] to run a server (or use a platform like Vercel), even if what you're doing is a fairly simple static site.
Combine inconsiderate crawler (AI or otherwise) with a server-side logic that doesn't really need to be there and you have a recipe for a crash, a big hosting bill, or both.
[0] People see https://nextjs.org/docs/app/guides/static-exports#unsupporte... and go "ah shucks I better have a server component then"
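For what it's worth, opting out of the server is mostly a one-line config change if you can live without the features listed on that page. A minimal sketch, assuming a recent Next.js version that supports output: 'export':

    // next.config.js - a sketch of a fully static build; the CDN serves ./out
    /** @type {import('next').NextConfig} */
    const nextConfig = {
      output: 'export',              // emit plain HTML/CSS/JS at build time
      images: { unoptimized: true }, // the image optimizer needs a server, so turn it off
    };

    module.exports = nextConfig;

The resulting out/ directory can be dumped on any static host or CDN, which is exactly the "best case scenario" from the comment above.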
> Why did you bring up static pages served by a CDN...
This is easier said than done, but pushing the latest topic snapshot to the CDN whenever a post is made is doable.
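Something along these lines, with the render function and output path being hypothetical stand-ins for whatever the forum software actually uses:

    // Sketch: when a post is saved, re-render that topic once and publish the
    // result as a static file that the CDN origin serves.
    const fs = require('node:fs/promises');
    const path = require('node:path');

    async function publishTopicSnapshot(topicId, renderTopicHtml) {
      const html = await renderTopicHtml(topicId);   // however the forum renders a topic
      const outFile = path.join('/var/www/static/topics', `${topicId}.html`);
      await fs.writeFile(outFile, html);             // CDN origin points at this directory
      // In practice you might upload to an object store behind the CDN instead,
      // and purge the CDN cache for just that one URL.
    }

Crawlers then hit a cached snapshot instead of regenerating the page on every request.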
I think that statement is way too strong and obviously not true of businesses. It might be true of hobbyist websites where the creator is personally more interested in the server side, but it's definitely not true of professional websites.
Professional websites that have enough of a budget to care about the server side will absolutely care about the client side and will track usage. If 10% fewer people used the website, the analytics would show that and there would be a fire drill.
Where I can agree with the author is on a more nuanced point. Client-side problems are a lot harder and have a very long tail due to unique client configurations (OS, browser, extensions, physical hardware). With thousands of combinations, you end up with some wild and rare issues. It becomes hard to chase all of them down, and some you just have to ignore.
This can make it feel like websites don't care about the client side, but it really just shows that the client side is hard.
Amazon.com Inc is currently worth 2.4 trillion dollars, and the only reason is that most businesses insist on giving their customers the worst online experience possible. I wish that I could one day understand the logic, which goes like this:
1. Notice that people are on their phones all the time.
2. And notice that when people are looking to buy something they first go on the computer or on the smart phone.
3. Therefore let's make the most godawful experience on our website possible, to make sure that our potential customers hate us and don't make a purchase.
4. Customers make their purchase on Amazon instead.
5. Profit??
This is an incredibly reductive view of how Amazon came to dominate online retail. If you genuinely believe this, I would strongly urge you to research their history and understand how they became the monopoly they are today.
I assure you, it's not primarily because they care more about the end user's experience.
Amazon, on the other hand, is plagued with fake or bad products from copycat sellers. I have no idea what I am going to get when I place an order. Frankly, I'm surprised when I get the actual thing I ordered.
No, they can't, as evidenced by the fact that not everyone else in e-commerce is doing that.
"Any business can do what Amazon does for their products and their customers."
What I meant is that any business can do for their products and their customers what Amazon does. Not that any business can do everything Amazon does.
There would be little reason for online marketplaces like Amazon to grow so huge if businesses had cared enough to provide a reasonable online experience, whether 20 years ago, 10 years ago, or 5 years ago. Now we are in 2025 and most businesses offer a worse online customer experience than what good businesses were offering 20 years ago. You can't be 20 years behind the times and say that it's impossible to compete. It's very possible to make a great customer experience and make money online, even for small businesses with limited means, as evidenced by the many companies doing exactly that.
It's the same with any marketplace like Booking.com or restaurant delivery apps. They wouldn't be half as big if the businesses they serve weren't too lazy and indifferent to make a decent online experience for their customers. But here we are.
Statements like this are just staggeringly ignorant of how businesses like Amazon operate.
Businesses don't need to be as good as Amazon or deliver as fast; Amazon is just an example. But businesses do need to take their online experience seriously if they don't want to be pushed aside by Amazon and the like. And few businesses seem to do that, even though it's not hard.
Huh?
Also, at least in Spain, some delivery companies are awful. Right now I have a package delivered to a convenience store. They refuse to give it to me because I have no delivery key; the courier never sent it to me. I tried to get assistance on their website... and they ask me for the very key I want them to give me. Nice, huh?
I asked the shop for a refund. They have ghosted me in the chat, their return form doesn't work, their email addresses are no-reply, and the contact form doesn't work either. Now I need to wait until Monday to phone them.
I know the shop is legit. They're just woefully incompetent and either don't know it or think that's just the way things work.
For cheap and moderately priced products, Amazon just works. No "but I went to your house and there was nobody there" bullshit. No-questions-asked return policy.
I think you're underselling the amount of work it takes to create an experience as smooth as Amazon's.
He makes a statement in an earlier article[1] that I think sums things up nicely:
> One thing I've wound up feeling from all this is that the current web is surprisingly fragile. A significant amount of the web seems to have been held up by implicit understandings and bargains, not by technology. When LLM crawlers showed up and decided to ignore the social things that had kept those parts of the web going, things started coming down all over the place.
This social contract is, to me, built around the idea that a human will direct the operation of a computer in real time (largely by using a web browser and clicking links), but I think that this approach is an extremely inefficient use of both the computer's and the human's resources (CPU and time, respectively). The promise of technology should not be to put people behind desks staring at a screen all day, so this evolution toward automation must continue.
I do wonder what the new social contract will be: Perhaps access to the majority of servers will be gated by micropayments, but what will the “deal” be for those who don’t want to collect payments? How will they prevent abuse while keeping access free?
[1] “The current (2025) crawler plague and the fragility of the web”, https://utcc.utoronto.ca/~cks/space/blog/web/WebIsKindOfFrag...
If 1000 AWS boxes start hammering your API you might raise an eyebrow, but 1000 requests coming from residential ISPs around the world could be an organic surge in demand for your service.
Residential proxy services break this - which has been happening on some level for a long time, but the AI-training-set arms race has driven up demand and thus also supply.
It's quite easy to block all of AWS, for example, but it's less easy to figure out which residential IPs are part of a commercially-operated botnet.
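The "easy" half really is easy, since AWS publishes its address ranges; a rough sketch (assumes Node 18+ for global fetch):

    // Sketch of the easy half: AWS publishes its IP ranges, so blocking them
    // wholesale is a lookup. There is no such list for rented residential IPs,
    // which is the hard half.
    const net = require('node:net');

    async function buildAwsBlockList() {
      const res = await fetch('https://ip-ranges.amazonaws.com/ip-ranges.json');
      const { prefixes } = await res.json();
      const list = new net.BlockList();
      for (const { ip_prefix } of prefixes) {
        const [addr, bits] = ip_prefix.split('/');
        list.addSubnet(addr, Number(bits), 'ipv4');
      }
      return list; // later: if (list.check(clientIp)) reject or throttle
    }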
Is the client navigating the site faster than humanly possible? It's a bot. This seems like a simple test.
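A naive sketch of that test, keyed on the client IP with made-up thresholds; as the replies below point out, it falls apart once requests are spread across a large IP pool:

    // Naive sketch of the "faster than humanly possible" test, keyed on IP.
    const hits = new Map(); // ip -> timestamps (ms) of recent requests

    function looksLikeBot(ip, now = Date.now()) {
      const windowMs = 10_000;       // look at the last 10 seconds
      const maxHumanPageLoads = 15;  // no human clicks through pages faster than this
      const recent = (hits.get(ip) || []).filter((t) => now - t < windowMs);
      recent.push(now);
      hits.set(ip, recent);
      return recent.length > maxHumanPageLoads;
    }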
> 1000 requests coming from residential ISPs around the world could be an organic surge
But probably isn't.
Not when a single bot has a pool of millions of IPs to originate each request from.
If you think there's an easy solution here, productize it and make billions.
As we've seen, security is really hard to build in after the fact. It has to be part of your design concept from the very start, and it pervades every other decision you make. If you try to layer security on top, you will lose.
Of course you may discover that a genuinely secure system is also unusably inconvenient and you lose to someone willing to take risks, and it's all moot.
Directly from the article:
> it's not new, and it goes well beyond anti-crawler and anti-robot defenses. As covered by people like Alex Russell, it's routine for websites to ignore most real world client side concerns (also, and including on desktops). Just recently (as of August 2025), Github put out a major update that many people are finding immensely slow even on developer desktops.
The things he links to are unrelated to anti-bot measures.
The fact is, the web is an increasingly unpleasant place to visit. Users are subject to terrible UX – dark patterns, tracking, consent popups, ads everywhere, etc.
Then along come chatbots and when somebody asks about something, they are given the response on the spot without having to battle their way through all that crap to get what they want.
Of course users are going to flock to chatbots. If a site owner is worried they are losing traffic to chatbots, perhaps they should take a long, hard look at what kind of user experience they are serving up to people.
This is like streaming media all over again. Would you rather buy a legit DVD and wait for it to arrive in the post, then sit through an unskippable lecture about piracy, then sit through unstoppable trailers, then find your way through a weird, horrible DVD menu… or would you rather download it and avoid all that? The thing that alleviated piracy was not locking things down even more; it was making the legitimate route more convenient.
We need to make websites pleasant experiences again, and we can’t do that when we care about everything else more than the user experience.
No other website can compete with that.
The whole story with streaming media is not just that pay streaming became more convenient. It’s also that content creators used legal and business mechanisms to make piracy inconvenient. They shut down Napster. They send DMCA notices. They got the DMCA enacted. They got YouTube working for them by serving ads with their content and thus monetizing it.
Chat bots are just like Napster. They’re free-riding off the content others worked to create. Just like with Napster, making websites more convenient will be only part of the answer.
Copyright holders, not content creators. Though content creators are typically also copyright holders, copyright holders are not always content creators, especially in this context. To a large degree these practices are not on behalf of content creators, nor are they helping them.
The solution may be elsewhere: starting from creating content that people may actually care about.
> No other website can compete with that.
Copyright infringers uploaded music, television, and films free of charge, yet people still pay for all of that.
> The whole story with streaming media is not just that pay streaming became more convenient. It’s also that content creators used legal and business mechanisms to make piracy inconvenient.
Do you seriously think that copyright infringement ended when Napster went down? Have you never heard of the Pirate Bay or Bittorrent? They didn’t succeed at all in shutting down copyright infringement. People pay for things because it’s convenient, not because copyright infringement is no longer an option.
This is something I've been pondering, and honestly I feel like the author doesn't go far enough. I would go as far as to say that a lot of our modern society has been held up by these implicit social contracts. But nowadays we see things like gerrymandering in the US, or the overuse of Article 49.3 in France to pass laws without a parliamentary vote. Just an overall trend of people feeling constrained only by the exact letter of the law and ignoring the spirit of it.
Except it turns out these implicit understandings that you shouldn't do that existed because breaking them makes life shittier for everyone, and that's what we're experiencing now.
The author needs to open with a paragraph that establishes better context. They open with a link to another post where they talk about anti-LLM defenses but it doesn't clarify what they are talking about when they compare server problems with client-side problems.
> You're using a suspiciously old browser
> You're probably reading this page because you've attempted to access some part of my blog (Wandering Thoughts) or CSpace, the wiki thing it's part of. Unfortunately you're using a browser version that my anti-crawler precautions consider suspicious, most often because it's too old (most often this applies to versions of Chrome). Unfortunately, as of early 2025 there's a plague of high volume crawlers (apparently in part to gather data for LLM training) that use a variety of old browser user agents, especially Chrome user agents. To reduce the load on Wandering Thoughts I'm experimenting with (attempting to) block all of them, and you've run into this.
> If this is in error and you're using a current version of your browser of choice, you can contact me at my current place at the university (you should be able to work out the email address from that). If possible, please let me know what browser you're using and so on, ideally with its exact User-Agent string.
Hopefully I solved his email address riddle.
Safari can't open the page because it couldn't establish a secure connection to the server.
Add "webmasters" or "sysadmins" to the list?
// An eighth reload worked.
https://web.archive.org/web/20250823105045if_/https://utcc.u...
https://utcc.utoronto.ca/~cks/cspace-generic-ua.html
...which complains about an "HTTP User-Agent header value that is too generic or otherwise excessively suspicious. Unfortunately, as of early 2025 there's a plague of high volume crawlers (apparently in part to gather data for LLM training) that behave like this.", and I'm left thinking that the person behind this site does not care about client-side problems...
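For illustration only, a guess at the shape of that kind of filter; the cutoff and pattern are invented, not the author's actual rules:

    // Illustrative guess: reject empty or overly generic User-Agents, and
    // reject Chrome versions older than some cutoff.
    function isSuspiciousUserAgent(ua, minChromeMajor = 120) {
      if (!ua || ua.length < 10) return true;              // "too generic"
      const match = ua.match(/Chrome\/(\d+)/);
      if (match && Number(match[1]) < minChromeMajor) return true; // ancient Chrome build
      return false;
    }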
Thinking of the most extreme option (throwing proof-of-work checks at browsers), the main sites that jump to mind are sourcehut, the Linux Kernel Archives, and so on, and the admins of all of those sites have noted that the traffic they get is far outside of expectations[0]. Not whatever blogspam ended up at the top of Google search that day.
The badly designed sites are often the ones that don't care about their bandwidth anyway.
[0]: https://drewdevault.com/2025/03/17/2025-03-17-Stop-externali...
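For context, the proof-of-work option boils down to something like this: hand the client a nonce, make it burn CPU finding a hash with enough leading zeros, and only then serve the page. A bare-bones sketch with an arbitrary difficulty:

    // Bare-bones proof-of-work sketch: the server issues a nonce, the client
    // must find a counter whose sha256(nonce + counter) starts with N zero hex
    // digits, and only then gets the real page.
    const crypto = require('node:crypto');

    function makeChallenge(difficulty = 4) {
      return { nonce: crypto.randomBytes(16).toString('hex'), difficulty };
    }

    function solve({ nonce, difficulty }) {
      const prefix = '0'.repeat(difficulty);
      for (let counter = 0; ; counter++) {
        const hash = crypto.createHash('sha256').update(nonce + counter).digest('hex');
        if (hash.startsWith(prefix)) return counter; // a browser pays this cost once per visit
      }
    }

    function verify({ nonce, difficulty }, counter) {
      const hash = crypto.createHash('sha256').update(nonce + counter).digest('hex');
      return hash.startsWith('0'.repeat(difficulty));
    }

The cost is negligible for one human visit but adds up fast for a crawler requesting millions of pages.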
Do you think that every small personal website is serving nothing but "a 3 line LLM generated blog post"? Do you not think there are some out there that have perfectly reasonable content? Much of it not even monetized?
And yet the bots are causing this problem for everyone. They are completely indiscriminate.
So before you try to dismiss this as a non-issue, maybe consider that there's more out there being affected by this than the absolute worst case possible to imagine.