#OpenStreetMap hammered by scrapers hiding behind residential proxy/embedded-SDK networks.
direwolf20•1h ago
More like hammered by Google and Apple so you'll use their apps instead.
Bender•1h ago
Looks like it is hosted at Equinix in NL? Or maybe just part of it? Is it behind a load balancer, maybe something like HAProxy? If so, were stick tables set up to limit rates by cookie, require people to be logged in on unique accounts, and limit anonymous access after so many requests? I know limiting anonymous access is not great, but it could be enabled only under high load, so that instead of the site going offline for everyone it would just be limited for anonymous users. Degradation vs. critical outage.
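To make the degradation idea concrete, here is a minimal Python sketch, not HAProxy configuration and not anything OSM is known to run. The class name, the `anon_limit` and `window_seconds` parameters, and the key format are all made up for illustration. It counts requests per client key in a sliding window and only refuses anonymous clients, and only while a high-load flag is set.

```python
from collections import defaultdict, deque


class DegradingLimiter:
    """Sliding-window counter that throttles anonymous clients only under high load.

    Hypothetical sketch: names and defaults are assumptions, not a real API.
    """

    def __init__(self, anon_limit=100, window_seconds=60.0):
        self.anon_limit = anon_limit
        self.window_seconds = window_seconds
        # client key (e.g. "ip:..." or "cookie:...") -> timestamps of recent requests
        self.hits = defaultdict(deque)

    def allow(self, key, logged_in, high_load, now):
        window = self.hits[key]
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        # Every request is counted, including ones that end up refused.
        window.append(now)
        # Logged-in users, and everyone when load is normal, are never refused here.
        if logged_in or not high_load:
            return True
        return len(window) <= self.anon_limit
```

With `anon_limit=2`, a third anonymous request inside the window is refused while the high-load flag is on, and the same client is served again once the flag clears or the window expires. Passing `now` in explicitly keeps the sketch easy to test; a real deployment would do this at the load balancer rather than in application code.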
On a separate note, have tcpdump captures been done on these excessive connections? Minus the IP, what do their SYN packets look like? Minus the IP, what do the corresponding log entries look like in the web server? Are they using HTTP/1.1 or HTTP/2? Are they missing any headers you'd expect from a real browser, such as Sec-Fetch-Mode (cors, no-cors, navigate) or Accept-Language?
tcpdump -p --dont-verify-checksums -i any -NNnnvvv -B32768 -c32 -s0 port 443 and 'tcp[13] == 2'
Is there someone at OpenStreetMap that can answer these questions?
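For the header question, a rough Python sketch of the kind of check I mean. The function name and the list of expected headers are my own assumptions; which headers a given browser actually sends varies by browser and version, and legitimate API clients often omit them, so a miss is a signal and not proof.

```python
# Headers a current desktop browser would usually send on an HTTPS request.
# This list is an assumption for illustration, not an authoritative fingerprint.
EXPECTED_BROWSER_HEADERS = (
    "user-agent",
    "accept-language",
    "sec-fetch-mode",
    "sec-fetch-site",
)


def missing_browser_headers(headers):
    """Return the expected header names absent from a request's headers.

    `headers` is a mapping of header name to value; names are compared
    case-insensitively because HTTP header names are case-insensitive.
    """
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_BROWSER_HEADERS if name not in present]
```

Run over parsed access-log entries, this would let you group the excessive traffic by which headers it lacks, without looking at IPs at all.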
phillipseamore•1h ago
The number of idiotic vibe coded repos I've seen on GH lately that are doing things like crawling OSM for POI data is mindboggling!