So I decided to build a probe script: an automated Chrome browser (matching the fingerprint of a real CUA setup) that probes the blocking mechanisms of URLs on the internet and tags each one green or red, depending on whether the page loads or a WAF steps in, and if so which vendor (in some cases a CUA harness is used to get better precision). The result is guestlist - a Python API that lets you check whether a URL is open for your CUA to interact with before you send your agent to it.
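To make the green/red tagging idea concrete, here is a minimal sketch of how a response could be classified once the browser has loaded a page. This is not guestlist's actual API or rule set: the names `tag_response`, `guess_vendor`, `ProbeResult`, the header hints, and the set of "blocked" status codes are all assumptions made up for illustration, and real WAF detection needs far more signals than a few headers.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical header hints, for illustration only. Real detection would
# also look at cookies, challenge pages, redirects, and page content.
WAF_HEADER_HINTS = {
    "cloudflare": ("cf-ray", "cf-mitigated"),
    "datadome": ("x-datadome",),
    "imperva": ("x-iinfo",),
}

# Assumption: these status codes are treated as "blocked" in this sketch.
BLOCK_STATUSES = {403, 429, 503}


@dataclass
class ProbeResult:
    tag: str                # "green" or "red"
    vendor: Optional[str]   # best-guess WAF vendor when red, else None


def guess_vendor(headers: Mapping[str, str]) -> Optional[str]:
    """Guess the WAF vendor from response headers (case-insensitive)."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    for vendor, header_names in WAF_HEADER_HINTS.items():
        if any(name in lowered for name in header_names):
            return vendor
    # Fall back to the Server header when no vendor-specific header is present.
    server = lowered.get("server", "")
    if "cloudflare" in server:
        return "cloudflare"
    if "akamaighost" in server:
        return "akamai"
    return None


def tag_response(status: int, headers: Mapping[str, str]) -> ProbeResult:
    """Tag a response red if it looks blocked, green otherwise."""
    if status in BLOCK_STATUSES:
        return ProbeResult("red", guess_vendor(headers))
    return ProbeResult("green", None)
```

Note that in this sketch a site sitting behind a WAF is still tagged green as long as the page loads; the vendor only matters once a block is observed.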
There are of course tools in this space that mask your agent to make it look more "human", and they work well, but I think having a tool like this alongside them can help with consistency on large-scale projects.