
Can Cloudflare's AI pay per crawl succeed? I doubt it

https://developerwithacat.com/blog/202507/cloudflare-pay-per-crawl/
2•mmarian•6mo ago

Comments

nabla9•6mo ago
> Neither Cloudflare, nor any other service, will ever be able to block all scrapers. They can make their operations more expensive,

Cloudflare acts like a single platform as far as crawlers are concerned. It gets the same amount of data as the big platforms do to block crawlers it doesn't want. Other big platforms (Google, Facebook, etc.) can prevent scrapers effectively when they don't want them. A nifty new scraper might crawl a few million URLs before it's detected.

mmarian•6mo ago
Hey! Sorry, didn't quite catch what you meant.

Is it that Cloudflare can always spot crawlers because of the amount of data they collect? Or is it that there's always a nifty new scraper that will get away with it?

nabla9•6mo ago
It's that Cloudflare can always spot crawlers. A few million random URLs crawled is nothing and provides no value for AI companies; they want everything.

A comprehensive crawl of LinkedIn, FB, Instagram, IMDb, Amazon would be worth a lot.

mmarian•6mo ago
> Cloudflare can always spot crawlers

I mention in the post a scraping service that Cloudflare isn't spotting: https://www.scrapingbee.com/blog/how-to-bypass-cloudflare-an...

There are plenty of open-source ones as well that could bypass it, e.g. maybe this one that came up in search (https://github.com/VeNoMouS/cloudscraper). Combine them with residential proxies and you're just not going to find them.

> A comprehensive crawl of LinkedIn, FB, Instagram, IMDb, Amazon would be worth a lot.

Just from a quick Google search:

- LinkedIn: https://brightdata.com/products/datasets/linkedin

- Amazon: https://www.junglescout.com/features/product-database/

nabla9•6mo ago
As I said, partial scrapes of small subsets over a long time provide no real value for AI scrapers.

Just an example: the Bright Data LinkedIn database has 19 million entries. LinkedIn has over 1 billion members.

As I said, partial scrapes of small subsets over a long time provide no real value for AI scrapers (repeating the main argument).
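
To put the numbers in that example side by side, here is a rough back-of-the-envelope sketch. It only uses the two figures quoted above, and it treats "over 1 billion" as exactly 1 billion, which is an assumption, so the real share would be no higher than this. The function name `coverage` is just illustrative.

```python
def coverage(scraped_entries: int, total_members: int) -> float:
    """Return the fraction of the full population present in a scraped dataset."""
    if total_members <= 0:
        raise ValueError("total_members must be positive")
    return scraped_entries / total_members


# Figures quoted in the thread. "Over 1 billion" is rounded down to 1e9 here
# (an assumption), so this is an upper bound on the dataset's coverage.
dataset_entries = 19_000_000
linkedin_members = 1_000_000_000

share = coverage(dataset_entries, linkedin_members)
print(f"{share:.1%}")  # prints "1.9%"
```

So on those figures the dataset covers roughly 2% of members, which is the gap the "partial scrape" argument is pointing at.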