This is the v1.5 update for CrawlerCheck. I recently realized how messy the bot landscape has become—distinguishing between "Good SEO Bots," "AI Scrapers" (like GPTBot/Claude), and aggressive tools is a pain.
I built this directory to catalog 150+ active crawlers. For each bot, I've manually verified:
1. The official User-Agent string, for spotting it in your logs (detection).
2. The official robots.txt token, for writing rules against it (blocking).
(Fun fact: these are often different. PageSpeed Insights, for example, sends a 'Chrome-Lighthouse' User-Agent but respects 'Google Page Speed Insights' in robots.txt.)
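To make the difference concrete, here's roughly what it looks like for GPTBot (strings quoted from memory, so treat them as approximate and check OpenAI's docs for the canonical versions):

    # What shows up in your access logs (detection):
    Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.0; +https://openai.com/gptbot

    # What goes in robots.txt (blocking):
    User-agent: GPTBot
    Disallow: /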
The Tech Stack:
- Backend: Go (custom crawler engine)
- Frontend: SvelteKit (static generation for directory pages)
Features:
- Bulk Generator: You can filter by "AI Bots" -> "Unsafe" and auto-generate a robots.txt snippet (example output below).
- Live Test: Sends a request to your URL with the bot's exact User-Agent so you can verify whether it gets a 403 or a 200 (rough sketch below).
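The generator output is just a plain robots.txt group. Something like this if you block the usual AI crawlers (tokens here from memory; verify against the directory before shipping it):

    User-agent: GPTBot
    User-agent: ClaudeBot
    User-agent: CCBot
    User-agent: Google-Extended
    Disallow: /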
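And the Live Test boils down to something like this Go sketch (not the production code, just the core idea: send a request with the bot's User-Agent and report the status):

    package main

    import (
        "fmt"
        "net/http"
        "os"
        "time"
    )

    func main() {
        if len(os.Args) != 3 {
            fmt.Println("usage: livetest <url> <user-agent>")
            os.Exit(1)
        }
        url, ua := os.Args[1], os.Args[2]

        // Build the request with the crawler's exact User-Agent header.
        req, err := http.NewRequest("GET", url, nil)
        if err != nil {
            fmt.Println("bad url:", err)
            os.Exit(1)
        }
        req.Header.Set("User-Agent", ua)

        client := &http.Client{Timeout: 10 * time.Second}
        resp, err := client.Do(req)
        if err != nil {
            fmt.Println("request failed:", err)
            os.Exit(1)
        }
        defer resp.Body.Close()

        // 200 means the bot gets content; 403 means your rules/WAF block it.
        fmt.Printf("%s as %q -> HTTP %d\n", url, ua, resp.StatusCode)
    }

Usage: go run livetest.go https://example.com "GPTBot/1.0"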
It’s free and I’m currently curating the list manually. Let me know if I missed any major bots or if the safety classifications seem off.