
Browser extensions turn nearly 1M browsers into website-scraping bots

https://arstechnica.com/security/2025/07/browser-extensions-turn-nearly-1-million-browsers-into-website-scraping-bots/
25•chha•8h ago

Comments

paulryanrogers•6h ago
Extensions and VPNs have been doing this for years, it's not a secret. Where I worked we paid a proxy/scraping company that also offered 'stealth' scraping using residential IPs. They got those IPs using techniques like these extensions.

The Chrome Web Store changed its policy years ago to prohibit these, with the rationale that an extension should have a single purpose. Apparently their scanning tools aren't enforcing the policy strictly enough.

mmsc•6h ago
Indeed, it's not a secret and it's not just extensions and VPNs, but everything you could imagine. Lots of applications that advertise themselves as "ways to make money for your unused internet bandwidth" are available which do this -- openly.

This type of software is bundled into system executables as well - just like the "free antivirus and browser toolbars" of yesterday, these are the new bundled software.

If a company has an "internal network" (lol) whose security can be described as Swiss cheese, then this stuff is a massive gap there.

josephg•6h ago
> Extensions and VPNs have been doing this for years, it's not a secret.

It's not a secret in the industry, but I bet money that most of your users have no idea this is happening. They almost certainly wouldn't install those web extensions if this information were widely known.

As a rule of thumb, if you need to do something in secret to get away with it, it's probably not ethical.

paulryanrogers•3h ago
It's supposed to be in the terms of service. Otherwise it is indeed fraud/abuse. Though I'd agree that most users don't read the fine print.
nerdjon•6h ago
I have to wonder how long it will be until browsers just do this natively.

It gets around the AI blockers that Cloudflare is pushing, with the added benefit of seeing information that a crawler would never see.

Just hide it behind an "AI Browser" that just sends everything your browser sees to the cloud anyways for processing...

Throw in some vague "privacy" promise for good measure.

(I realize this is being more sneaky and doing stuff in the background, but my question remains)

Cthulhu_•6h ago
This may already be happening to a point; I forgot what it's called, but in Chrome you can opt in to sharing analytical data, which is used by Google's PageSpeed Insights tooling and/or Lighthouse to measure your site's performance across a wide range of devices and internet connections.
xnx•6h ago
I'd be OK with an open reciprocal crawling network for non-personal/private pages as it would be a distributed force against walled gardens.

I'm very against this being done surreptitiously/deceptively and on private content (emails, chats, etc.)

mdaniel•3h ago
I ran an extension that automatically submitted pages to the Internet Archive as I browsed them, but managing the allowlist/denylist turned into a major hassle. I eventually just installed the extension into a "public browsing" profile, but as is often the case it turned into "I don't feel like switching to that profile" and it fell by the wayside.

But, in the same vein as your comment, I have long wished for Common Crawl to really lean into their mission, and not just publish monthly snaps of whatever their bots can see but do what you said and accept .har or .warc files from anyone and serve the ... hourly? ... .warc via BitTorrent.

riedel•5h ago
I wonder why nothing like F-Droid ever took off for browser extensions. Even if tons of stuff is open source, the standard distribution format is a zip file with unknown content. And browser vendors never lived up to their promise to check even the most basic things. Also, the whole manifest mess is more a means to secure ad revenue than to protect users.
mdaniel•3h ago
I can think of 2 pragmatic reasons:

1. If one wished to use .xpi/.crx (akin to F-Droid's install pathway) then the user would have to teach the browser to trust their signatures. F-Droid doesn't suffer from this because each .apk is self-trusting, meaning it is signed, and that signature conveys lineage (v1.0 is owned by the same publisher as v1.1, so it is safe to upgrade), but the operating system doesn't have to be informed about any chain of custody for the .apk cert.

2. I am not aware of any self-hosting extension registry, even from Mozilla, and extra lol for Chromium. If such a thing existed, the browser would have to allow the user to add "trusted extension registries" (along with their trusted CA chain). It would actually be snazzy if they went the Helm/Homebrew route and just leveraged OCI distribution (aka docker registry) for that, since it would open up almost unlimited self-hosting options, including publishing right from GitHub Actions to ghcr.io
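
To make the "signature conveys lineage" idea in point 1 concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not how Android or any browser actually implements it: `LineageStore` and `may_install` are hypothetical names, and the sketch only pins the signer's certificate fingerprint on first install. It does not verify the package signature itself, which a real installer would do first.

```python
import hashlib


class LineageStore:
    """Hypothetical store that pins the signing cert seen on first install."""

    def __init__(self):
        # Maps a package id to the SHA-256 fingerprint of its signer cert.
        self._pinned = {}

    @staticmethod
    def fingerprint(cert_bytes: bytes) -> str:
        """Return a hex SHA-256 fingerprint of the raw certificate bytes."""
        return hashlib.sha256(cert_bytes).hexdigest()

    def may_install(self, package_id: str, cert_bytes: bytes) -> bool:
        """Allow a first install, or an upgrade signed by the pinned cert."""
        fp = self.fingerprint(cert_bytes)
        pinned = self._pinned.get(package_id)
        if pinned is None:
            # First install: trust on first use and remember the signer.
            self._pinned[package_id] = fp
            return True
        # Upgrade: accept only if the signer matches the one pinned earlier.
        return pinned == fp


store = LineageStore()
first_install = store.may_install("example-extension", b"publisher-cert")
same_signer_upgrade = store.may_install("example-extension", b"publisher-cert")
other_signer_upgrade = store.may_install("example-extension", b"someone-else")
```

The point of the design is that no external chain of custody is needed: the only question asked at upgrade time is whether the new version comes from whoever signed the version already installed.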

riedel•2h ago
IMHO it would be rather easy to overcome this by forking. I have been using forks like LibreWolf, Betterbird and recently Zen for Mozilla stuff anyway, due to all this telemetry (I guess you need not care about malware if the browser already contains so many trackers).
mdaniel•3h ago
I'm shocked that command-f "honey" didn't return any hits
