Anatomy of a Domain Risk Engine: Why Regex Isn't Enough (And LLMs Are Too Slow)
Building a URL scanner in 2025 is an exercise in managing trade-offs.
If you rely solely on traditional methods like regex and blocklists, you miss the sophisticated attacks (false negatives). If you send every single URL to a massive, general-purpose Large Language Model (LLM), you will go bankrupt: the token costs simply don't scale for high-volume scanning.
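One way to picture the trade-off is a tiered triage: cheap local checks settle the obvious cases, and only the ambiguous remainder goes on to an expensive model. The sketch below is a minimal illustration of that idea, not the engine's actual design. The blocklist entry, the keyword pattern, the hyphen heuristic, and the names `cheap_verdict` and `triage` are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical blocklist and keyword pattern, for illustration only.
BLOCKLIST = {"evil.example"}
SUSPICIOUS = re.compile(r"(login|verify|secure|account)", re.IGNORECASE)


def cheap_verdict(url: str) -> str:
    """Return 'block', 'allow', or 'escalate' using only local checks."""
    host = (urlsplit(url).hostname or "").lower()
    if host in BLOCKLIST:
        return "block"
    # Assumed heuristic: bait keywords or many hyphens in the host look ambiguous.
    if SUSPICIOUS.search(host) or host.count("-") >= 2:
        return "escalate"
    return "allow"


def triage(urls):
    """Group URLs by verdict; only the 'escalate' bucket would reach a costly model."""
    buckets = {"block": [], "allow": [], "escalate": []}
    for url in urls:
        buckets[cheap_verdict(url)].append(url)
    return buckets
```

In this sketch the expensive model would only ever see the `escalate` bucket, so its cost scales with the number of ambiguous URLs rather than with total traffic.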
tomerhe•2h ago