Show HN: We're tracking AI bot visits daily across our network

2•legitcoders•2h ago
Hi HN,

Since launching LLMS Central (https://llmscentral.com) a few months ago, we're now tracking hundreds of AI bot visits daily across our network. The data is fascinating.

### What We're Seeing

*Daily Bot Traffic (Across Our Network):*
- 300-500+ AI bot visits per day
- GPTBot (ChatGPT) dominates at ~60% of traffic
- Claude, Perplexity, and Google's AI bots make up most of the rest
- Peak crawling hours: 2-4 AM UTC (training runs?)

*Real Patterns Emerging:*
- Technical documentation gets 5x more AI bot traffic than average content
- Blog posts with code examples are crawled 3x more frequently
- Sites with llms.txt files see 40% more organized crawling
- Most sites have zero visibility into AI bot activity

*Surprising Findings:*
1. AI bots are WAY more active than most people realize
2. They're not just training - they're actively crawling for real-time answers
3. Different bots have different content preferences (Claude likes long-form, Perplexity loves news)
4. Traditional analytics completely miss this traffic

### Technical Details

*Stack:*
- Next.js 15 (App Router)
- Firebase Firestore for analytics
- 2KB tracking script (async, zero perf impact)
- Real-time user-agent detection + IP verification

*Bot Detection:*
- User-agent parsing (GPTBot, Claude-Web, etc.)
- IP range verification (OpenAI, Anthropic, Google)
- Behavioral analysis (crawl patterns)
- 99%+ accuracy

*Privacy:*
- No PII collected
- GDPR compliant
- Users control data retention
- Open source tracking script (coming soon)
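The first two detection steps above (user-agent parsing, then IP range verification) could look roughly like the sketch below. This is not the LLMS Central implementation: `classify_visit` and both lookup tables are hypothetical names, the token list is only illustrative, and the IP ranges are placeholder documentation blocks rather than real vendor ranges.

```python
import ipaddress

# Illustrative user-agent tokens mapped to a vendor label (assumption, not exhaustive).
AI_BOT_TOKENS = {
    "GPTBot": "OpenAI",
    "Claude-Web": "Anthropic",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity",
}

# PLACEHOLDER ranges taken from the RFC 5737 documentation blocks.
# A real deployment would load each vendor's published ranges instead.
PUBLISHED_RANGES = {
    "OpenAI": [ipaddress.ip_network("192.0.2.0/24")],
    "Anthropic": [ipaddress.ip_network("198.51.100.0/24")],
}


def classify_visit(user_agent, remote_ip):
    """Return (vendor, ip_verified); (None, False) when no known token matches."""
    ua = user_agent.lower()
    for token, vendor in AI_BOT_TOKENS.items():
        if token.lower() in ua:
            addr = ipaddress.ip_address(remote_ip)
            # A matching user agent alone can be spoofed, so also check the source IP.
            verified = any(addr in net for net in PUBLISHED_RANGES.get(vendor, []))
            return vendor, verified
    return None, False
```

A visit whose user agent names a bot but whose IP falls outside the vendor's ranges comes back as unverified, which is one way to separate real crawlers from spoofed ones.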

### Why I Built This

I noticed my technical blog posts were getting cited by ChatGPT, but Google Analytics showed nothing. Turns out AI bots don't show up in traditional analytics because they're not "users" - they're crawlers.

After manually parsing server logs for weeks, I realized:
1. This should be automated
2. There should be a standard for AI bot permissions (like robots.txt)
3. Sites need visibility into which AI systems are using their content
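For anyone who wants to try the manual route first, here is a minimal sketch of that kind of log parsing. It assumes the "combined" access-log format used by nginx and Apache; `count_bot_hits`, the token list, and the sample lines are all made up for illustration.

```python
import re
from collections import Counter

# Pattern for the "combined" access-log format (assumed; adjust for other formats).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Illustrative user-agent tokens, not a complete list.
BOT_TOKENS = ("GPTBot", "ClaudeBot", "Claude-Web", "PerplexityBot")


def count_bot_hits(lines):
    """Tally (bot, path) pairs for lines whose user agent contains a known token."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        user_agent = match.group("ua")
        for token in BOT_TOKENS:
            if token in user_agent:
                hits[(token, match.group("path"))] += 1
                break
    return hits


# Made-up sample lines; real input would come from an access log file.
sample = [
    '192.0.2.10 - - [19/Oct/2025:02:13:44 +0000] "GET /docs/api HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '192.0.2.11 - - [19/Oct/2025:02:14:02 +0000] "GET /docs/api HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [19/Oct/2025:03:01:10 +0000] "GET /blog/post HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.5 - - [19/Oct/2025:09:30:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/130.0"',
    'not a log line',
]

for (bot, path), count in count_bot_hits(sample).most_common():
    print(bot, path, count)
```

Grouping by page as well as by bot gives the page-level view directly from logs.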

So I built LLMS Central - both a tracking platform AND a centralized repository for llms.txt files (the proposed standard for AI bot permissions).

### Features

1. *Real-time bot tracking* - See which AI crawlers visit your site
2. *Page-level analytics* - Know which pages AI bots prefer
3. *AEO scoring* - Measure Answer Engine Optimization (like SEO, but for AI)
4. *Multi-engine preview* - See how ChatGPT vs Claude would cite your content
5. *llms.txt generator* - Like robots.txt, but for AI (proposed standard)
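As a reference point for the robots.txt comparison, per-bot rules in robots.txt can already be checked with Python's standard library. The sketch below shows only robots.txt behavior, not the llms.txt format; the rules and `example.com` URLs are hypothetical, and `may_crawl` is a helper name invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot from /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(bot, url):
    """Return True if the parsed rules allow `bot` to fetch `url`."""
    return parser.can_fetch(bot, url)


print(may_crawl("GPTBot", "https://example.com/private/notes"))
print(may_crawl("GPTBot", "https://example.com/docs/"))
print(may_crawl("ClaudeBot", "https://example.com/private/notes"))
```

With these rules only the first check is refused: the GPTBot group applies to GPTBot alone, and every other crawler falls through to the wildcard group.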

### Try It

*Preview tool (no signup):* https://llmscentral.com/aeo-preview

*Full tracking (free tier):* https://llmscentral.com/dashboard

### The Data Keeps Growing

What started as a personal project is now tracking hundreds of domains. Every day we see:
- New AI bots appearing (just detected Meta's AI crawler last week)
- Crawling patterns evolving (bots are getting smarter about what they crawl)
- Sites realizing they have zero visibility into AI usage of their content

The most common reaction: "I had no idea ChatGPT was crawling my site this much."

### Questions

1. Should there be a standard for AI bot permissions (like robots.txt)? We're pushing llms.txt, but curious about alternatives.
2. How should sites monetize AI training data? Or should they?
3. Is "Answer Engine Optimization" (AEO) the future of SEO?
4. What data would YOU want to see about AI bot traffic?

Would love HN's feedback on the technical approach, privacy considerations, and what data would be most valuable to track.
