Scraping via Googlebot – How is it possible?

2•devx_•21m ago
Hi,

I run a website that recently saw unusually high traffic from what appeared to be legitimate Googlebot crawlers. After analyzing the access patterns, I was able to identify the source.

Background

Someone has been scraping my website extensively with what appears to be authentic Googlebot traffic. I traced the activity back to the person responsible, who told me they use a commercial API service that can trigger real Googlebot crawls on demand.

Technical Details

I tested the service myself to verify their claims and confirmed that it does dispatch legitimate Googlebot requests to any URL within 1–2 seconds.

Verified Googlebot IPs (via reverse DNS):

- 66.249.76.65 → crawl-66-249-76-65.googlebot.com

- 192.178.4.87 → crawl-192-178-4-87.googlebot.com

- 2001:4860:4801:002d::0006 → crawl-2001-4860-4801-002d...googlebot.com

- Additional IPs from 34.96.x.x range → googleusercontent.com
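For anyone who wants to repeat the reverse DNS check on their own logs, here is a minimal sketch of the usual two-step verification: reverse lookup, then a forward lookup that must map back to the same IP. The function names and the suffix list are my own assumptions for illustration. It only accepts googlebot.com hostnames, so the googleusercontent.com addresses above would not pass.

```python
import ipaddress
import socket
from typing import Callable, List

# Assumption: only hostnames under googlebot.com count as Googlebot here.
ALLOWED_SUFFIXES = (".googlebot.com",)


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Return every address (IPv4 and IPv6) the hostname resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Reverse DNS must end in an allowed suffix and resolve back to the same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not host.endswith(ALLOWED_SUFFIXES):
        return False
    try:
        candidates = forward(host)
    except OSError:
        return False
    # Compare parsed addresses so IPv6 spellings like 002d::0006 and 2d::6 match.
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(c) == target for c in candidates)
```

Calling is_verified_googlebot("66.249.76.65") performs live DNS queries. The resolver arguments exist only so the logic can be exercised without network access.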

Request Headers:

- User-Agent: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

- From: googlebot(at)googlebot.com

- Referer: https://www.google.com/

What Makes This Unusual:

- The service returns scraped HTML within 1–2 seconds

- It works for completely fresh URLs that have never been crawled

- All reverse DNS lookups confirm legitimate Google infrastructure

- The requests are triggered on demand via an API call

Verification Offer

I'm happy to validate these claims by having the service trigger a crawl to a unique test URL, so you can verify in your internal logs that it's genuinely Googlebot being dispatched.
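A rough sketch of how such a test could be set up: generate a URL containing a random token that has never been published, then search the access logs for that token. The base URL, the /crawl-test/ path, and the assumption that the client IP is the first field of each log line are all hypothetical.

```python
import secrets
from typing import Iterable, List, Tuple


def make_test_url(base: str) -> Tuple[str, str]:
    """Return (token, url), where the token is random and unguessable."""
    token = secrets.token_hex(16)
    return token, f"{base.rstrip('/')}/crawl-test/{token}"


def find_hits(log_lines: Iterable[str], token: str) -> List[Tuple[str, str]]:
    """Return (client_ip, line) for every log line that mentions the token.

    Assumes the client IP is the first whitespace-separated field.
    """
    return [(line.split()[0], line) for line in log_lines if token in line]


# Example with a placeholder domain:
token, url = make_test_url("https://example.com")
sample_log = [
    f'66.249.76.65 - - "GET /crawl-test/{token} HTTP/1.1" 200',
    '203.0.113.9 - - "GET /index.html HTTP/1.1" 200',
]
hits = find_hits(sample_log, token)
```

Any hit on a URL that was only ever submitted to the service would have been triggered by it, and the client IP of each hit can then be checked with reverse DNS.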

Any insights into how this is technically possible?

Thanks!
