I'm the technical founder of Zyro. I built this because I kept seeing my "Direct" traffic bucket inflate while my ad spend efficiency dropped. I suspected the traffic wasn't actually "Direct," but that standard analytics tools (GA4) were losing the referrer data from newer sources and misclassifying those visits.
I spent the last year building a custom detection engine to "unmask" this traffic.
The Tech Stack & Implementation:
The Core Problem: Most AI tools (ChatGPT, Perplexity, Claude) and "dark social" apps don't pass standard referrer headers.
The Solution: I built a TrafficSourceDetector that parses over 50 tracking parameters (like ttclid, gbraid, and AI-specific signatures) that default analytics configs usually sanitize or drop.
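For a sense of what that looks like, here's a minimal, self-contained C# sketch of the classification step. The click-ID map and the AI hostname list are a small illustrative slice, not the real 50+ parameter table, and the names are placeholders rather than the production code:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Minimal sketch of the classification step. The click-ID map and the AI hostname
    // list below are illustrative, not the full parameter set.
    public static class TrafficSourceDetectorSketch
    {
        // Click IDs that identify the ad platform even when no referrer is sent.
        static readonly Dictionary<string, string> ClickIdParams = new Dictionary<string, string>
        {
            { "ttclid", "TikTok Ads" },
            { "gclid", "Google Ads" },
            { "gbraid", "Google Ads (iOS)" },
            { "wbraid", "Google Ads (web-to-app)" },
            { "fbclid", "Meta Ads" },
            { "msclkid", "Microsoft Ads" }
        };

        // Hostnames that suggest an AI assistant sent the visitor, via referrer or utm_source.
        static readonly string[] AiSignatures = { "chatgpt.com", "chat.openai.com", "perplexity.ai", "claude.ai" };

        public static string Classify(Uri landingUrl, string referrer)
        {
            var query = ParseQuery(landingUrl.Query);

            foreach (var entry in ClickIdParams)
                if (query.ContainsKey(entry.Key)) return entry.Value;

            string utmSource;
            if (query.TryGetValue("utm_source", out utmSource) && MatchesAi(utmSource))
                return "AI Assistant";

            if (!string.IsNullOrEmpty(referrer) && MatchesAi(referrer))
                return "AI Assistant";

            return string.IsNullOrEmpty(referrer) ? "Direct (unresolved)" : "Referral";
        }

        static bool MatchesAi(string value) =>
            AiSignatures.Any(sig => value.IndexOf(sig, StringComparison.OrdinalIgnoreCase) >= 0);

        // Tolerant query parsing: last value wins on duplicate keys.
        static Dictionary<string, string> ParseQuery(string query)
        {
            var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (var pair in query.TrimStart('?').Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries))
            {
                var parts = pair.Split(new[] { '=' }, 2);
                result[Uri.UnescapeDataString(parts[0])] = parts.Length > 1 ? Uri.UnescapeDataString(parts[1]) : "";
            }
            return result;
        }
    }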
Data Handling: One major headache was long URLs getting truncated in length-limited string columns. I migrated the schema to NVARCHAR(MAX) in SQL Server so the massive, parameter-heavy URLs modern ad platforms generate are stored without data loss.
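The migration itself is a one-line ALTER; here it is wrapped in a throwaway ADO.NET script (assuming the Microsoft.Data.SqlClient package; the connection string, table, and column names are placeholders, not the actual Zyro schema):

    using Microsoft.Data.SqlClient;

    // Widen the landing-URL column so parameter-heavy URLs are stored intact.
    // Placeholder connection string, table, and column names.
    var connectionString = "Server=.;Database=Zyro;Integrated Security=true;TrustServerCertificate=true";

    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var command = new SqlCommand(
            "ALTER TABLE dbo.Visits ALTER COLUMN LandingUrl NVARCHAR(MAX) NOT NULL;", connection))
        {
            command.ExecuteNonQuery();
        }
    }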
Optimization Logic: Instead of frequentist A/B testing (which keeps sending traffic to losing variants until the test ends), I implemented a Multi-Armed Bandit (Thompson Sampling). It updates in real time and automatically shifts traffic toward the better-performing variant.
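Since people usually ask how the bandit works: each variant keeps a Beta(successes + 1, failures + 1) posterior over its conversion rate; on every request we draw one sample per variant and serve the highest draw. Here's a stripped-down, in-memory sketch, not the production code (the class shape and the Gamma/Box-Muller samplers are mine for illustration):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Thompson Sampling over conversion rate: each variant keeps a Beta(successes+1, failures+1)
    // posterior; per visitor we sample each posterior once and serve the highest draw.
    public class ThompsonSamplingBandit
    {
        class Arm { public int Successes; public int Failures; }

        readonly Dictionary<string, Arm> _arms;
        readonly Random _rng = new Random();

        public ThompsonSamplingBandit(IEnumerable<string> variantIds) =>
            _arms = variantIds.ToDictionary(id => id, _ => new Arm());

        // Called per visitor: routes traffic toward better-performing variants
        // while still exploring uncertain ones.
        public string ChooseVariant() =>
            _arms.OrderByDescending(a => SampleBeta(a.Value.Successes + 1, a.Value.Failures + 1))
                 .First().Key;

        // Called once the visit's outcome is known (converted or not).
        public void RecordOutcome(string variantId, bool converted)
        {
            if (converted) _arms[variantId].Successes++;
            else _arms[variantId].Failures++;
        }

        double SampleBeta(double a, double b)
        {
            double x = SampleGamma(a), y = SampleGamma(b);
            return x / (x + y);
        }

        // Marsaglia-Tsang gamma sampler (scale = 1).
        double SampleGamma(double shape)
        {
            if (shape < 1)
                return SampleGamma(shape + 1) * Math.Pow(1.0 - _rng.NextDouble(), 1.0 / shape);

            double d = shape - 1.0 / 3.0, c = 1.0 / Math.Sqrt(9.0 * d);
            while (true)
            {
                double x, v;
                do { x = SampleStandardNormal(); v = 1.0 + c * x; } while (v <= 0);
                v = v * v * v;
                double u = _rng.NextDouble();
                if (u < 1.0 - 0.0331 * x * x * x * x) return d * v;
                if (Math.Log(u) < 0.5 * x * x + d * (1.0 - v + Math.Log(v))) return d * v;
            }
        }

        // Box-Muller transform for a standard normal draw.
        double SampleStandardNormal()
        {
            double u1 = 1.0 - _rng.NextDouble(), u2 = _rng.NextDouble();
            return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
        }
    }

Usage in this sketch is just ChooseVariant() at request time and RecordOutcome(variantId, converted) when the goal fires (or doesn't).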
Latency: To keep the variant-swap "flicker" from hurting UX, I moved the geolocation logic (MaxMind) to a local instance instead of an external API call, keeping decision latency near 0ms.
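Concretely, a lookup against a local MaxMind database looks roughly like this (assuming the official MaxMind.GeoIP2 NuGet package and a downloaded GeoLite2 .mmdb file; the path and IP are placeholders):

    using System;
    using MaxMind.GeoIP2;
    using MaxMind.GeoIP2.Exceptions;

    // Reading from a local .mmdb file keeps the lookup in-process (microseconds)
    // instead of a network round-trip to a geolocation API.
    // The reader is expensive to construct, so create it once and reuse it.
    using var reader = new DatabaseReader("/data/GeoLite2-Country.mmdb");

    try
    {
        var response = reader.Country("203.0.113.42"); // visitor IP (example address)
        Console.WriteLine(response.Country.IsoCode);   // e.g. "US"
    }
    catch (AddressNotFoundException)
    {
        Console.WriteLine("Unknown location");         // private or unlisted IPs
    }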
Scanning: The visual editor uses HtmlAgilityPack to parse the DOM and identify "testable" elements (headlines, buttons) automatically.
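The scan itself is conceptually simple. A trimmed-down version (the sample markup and the XPath heuristic are simplified stand-ins for the real element-detection rules):

    using System;
    using HtmlAgilityPack;

    // Parse the page and pull out candidate elements for testing:
    // headlines, buttons, and links that look like calls to action.
    var html = "<html><body><h1>Launch faster</h1>" +
               "<button class=\"btn\">Start free trial</button></body></html>"; // sample markup

    var doc = new HtmlDocument();
    doc.LoadHtml(html);

    var candidates = doc.DocumentNode.SelectNodes(
        "//h1 | //h2 | //button | //a[contains(@class, 'btn') or contains(@class, 'cta')]");

    if (candidates != null) // SelectNodes returns null when nothing matches
    {
        foreach (var node in candidates)
            Console.WriteLine(node.Name + ": " + node.InnerText.Trim() + "  (" + node.XPath + ")");
    }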
The tool is currently live. I'm looking for feedback on the detection logic—specifically if anyone else is seeing massive "Direct" traffic that turns out to be AI scrapers/users.
Happy to answer questions about the bandit algorithm or the SQL architecture!