Scraping Shock: Why Web Data Is Getting Too Expensive to Scrape

https://scrapeops.io/blog/scraping-shock/
4•Ian_Kerins•1h ago

Comments

Ian_Kerins•1h ago
One of the main ideas we explored here is how scraping has shifted from being mainly a technical challenge to an economic one:

- Infrastructure and proxies have gotten cheaper, but anti-bot defenses have evolved fast.

- Because of that, the real cost of scraping is now the cost per successful result, and spikes of 5x–20x can happen when defenses tighten.

- The bottleneck today isn’t just “can you scrape it?”, it’s whether you can do it profitably and efficiently.

I’d love to hear how folks here are dealing with rising scraping costs or what strategies have worked when data value doesn’t obviously outweigh defense costs.
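
To make the cost-per-successful-result idea above concrete, here is a minimal sketch. The function name, request prices, and success rates are made-up illustrative assumptions, not figures from the article; the point is only that a higher per-request cost combined with a lower success rate multiplies the effective cost.

```python
def cost_per_success(cost_per_request: float, success_rate: float) -> float:
    """Effective cost of one successful result.

    cost_per_request: what one attempt costs (proxy, browser time, API credit).
    success_rate: fraction of attempts that return usable data, in (0, 1].
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_request / success_rate


# Hypothetical numbers, for illustration only.
# Before: cheap datacenter requests that almost always succeed.
before = cost_per_success(cost_per_request=0.001, success_rate=0.95)
# After defenses tighten: pricier requests that fail more often.
after = cost_per_success(cost_per_request=0.004, success_rate=0.40)

spike = after / before
print(f"before: ${before:.5f}  after: ${after:.5f}  spike: {spike:.1f}x")
```

Under these assumed inputs the per-request price rises only 4x, but the effective cost per result rises more because failed attempts are still paid for.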

joe_91•1h ago
Nice concept. I've definitely seen this play out in practice.

A lot of sites aren't impossible to scrape, but they're steadily getting more expensive. We're having to lean more on residential proxies, headless browsers, etc., just to get the same data that used to be straightforward...

fidansin•1h ago
I'm not fully convinced scraping has actually gotten harder. It feels more like the average approach has gotten softer.

Lately everything gets framed as rising costs or unstoppable anti-bot systems, but most sites didn't suddenly become impenetrable. What changed is how people react to friction.

We're in an AI-autopilot phase now. Hit a block and the instinct is to buy more credits, switch vendors, or let an API abstract the problem away. Meanwhile, teams still doing basic engineering work around sessions, behavior, pacing, and retries are often scraping the same targets just fine.

Honest question: have scraping costs really exploded, or have engineering standards quietly dropped as abstraction layers piled up?
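
As one example of the "basic engineering work" on pacing and retries mentioned above, here is a minimal sketch of retrying with exponential backoff and jitter. The `fetch_with_retries` helper, its parameters, and the convention that `fetch` returns `None` on failure are all assumptions for illustration, not anything described in the article or the thread.

```python
import random
import time
from typing import Callable, Optional


def fetch_with_retries(
    fetch: Callable[[], Optional[str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `fetch` until it returns a result or attempts run out.

    `fetch` is a hypothetical zero-argument callable that returns the page
    body on success and None on failure (assumed convention).
    Waits base_delay * 2**attempt plus random jitter between attempts.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result is not None:
            return result
        # Do not sleep after the final failed attempt.
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return None
```

Passing `sleep` as a parameter is a design choice that lets the pacing be tested without actually waiting; in real use the default `time.sleep` applies.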

Ian_Kerins•1h ago
Interesting take on it. Some people probably wouldn't like to be called soft, but there is likely some truth to it.

I feel it really comes down to priorities.

Scraping has always been a means to an end for most companies: get data and then use it for something valuable. Before, getting the data was easy, but now it is getting increasingly hard.

I think the key here is highlighting the fact that the time of cheap, easy, low-skilled access to web data is ending. Companies either need to skill up on understanding how to bypass anti-bots or pay someone else to do it for them so they can focus on the data.

fidansin•1h ago
I just worry we're collapsing two things into one bucket: harder in absolute terms vs harder relative to how much real engineering effort teams are willing to invest.

Those aren't the same, and to me the distinction matters.

bediger4000•56m ago
Ethically dubious article. Treats using "residential proxies", which are probably installed by some kind of cybercriminal, as a legitimate thing to do. Similarly, treats circumventing anti-scraping measures as a legitimate thing to do. They aren't. Take the hint, ignore web sites with some kind of anti-bot, or anti-scraper system. Ignore web sites with a scraper junkyard. Those people don't want you to have their content.

> When a website upgrades its anti-bot system, it doesn't just make scraping slightly harder. It can make it 5X, 10X, or even 50X more expensive overnight.

This, of course, is very good news. Keep up the good work, folks!

joe_91•42m ago
Tell that to the thousands of apps/sites out there which rely on scraped data ;) (Including all search engines/LLMs/price comparison sites etc)

lucas_camargo•53m ago
Good article! The cost-per-success metric really is the overlooked part
