It is a digression, but I imagine many others are facing similar issues.
> The main IAPSOP server is being overrun by unknown crawlers running on IP addresses controlled by Amazon Web Services (AWS), crawlers with IP addresses in the People's Republic of China, and other miscreants...
I've blocked most of Amazon, Alibaba Cloud and other cloud ASNs. Facebook's page preview crawler API was another abuser. There are also several problematic Chinese ISPs. You can identify those networks by their generated user agents, which are either outdated or outright impossible. As I have no customers in those regions, blocking the entire ASN seems like the obvious move.
In addition, the common User-Agent filters should be employed. You can drop an ASN when it racks up an excessive number of 403s, belongs to a cloud provider, or sits in a problematic region.
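For anyone who wants to try the "drop the ASN" rule, here is a rough Python sketch, not my actual setup. Everything in it is an assumption made for illustration: the prefix-to-ASN table (documentation IP ranges and documentation ASNs), the `CLOUD_ASNS` set, the `MAX_403S` threshold, and the `(ip, status)` log format. A real deployment would load prefixes from a BGP or ASN dump and feed the result into the firewall. The region check would work the same way as the cloud check, so it is left out.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-ASN table. These are documentation ranges and
# documentation ASNs; real data would come from a BGP/ASN dump.
PREFIX_TO_ASN = {
    "203.0.113.0/24": 64500,
    "198.51.100.0/24": 64501,
}
CLOUD_ASNS = {64501}  # assumed list of cloud-provider ASNs
MAX_403S = 3          # assumed threshold for "excessive" 403s

NETWORKS = [(ipaddress.ip_network(p), asn) for p, asn in PREFIX_TO_ASN.items()]


def asn_for(ip):
    """Return the ASN whose prefix contains ip, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for net, asn in NETWORKS:
        if addr in net:
            return asn
    return None


def asns_to_drop(log_entries):
    """log_entries: iterable of (ip, http_status) tuples from the access log."""
    forbidden = Counter()
    drop = set()
    for ip, status in log_entries:
        asn = asn_for(ip)
        if asn is None:
            continue  # no mapping, leave it alone
        if asn in CLOUD_ASNS:
            drop.add(asn)  # cloud provider: drop on sight
        if status == 403:
            forbidden[asn] += 1
            if forbidden[asn] > MAX_403S:
                drop.add(asn)  # too many 403s from this ASN
    return drop
```

Calling `asns_to_drop` on parsed access-log tuples gives the set of ASNs to hand to whatever does the actual blocking.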
NoMoreNicksLeft•38m ago
Oh wow, I've been downloading magazines for the last few months, always good to find more. luminist.org has been kicking my ass the last few weeks, but almost done and I can move on to these.