frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•57s ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
1•elashri•1m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•1m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•1m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•3m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•3m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•3m ago•1 comments

The Neuroscience Behind Nutrition for Developers and Founders

https://comuniq.xyz/post?t=797
1•01-_-•3m ago•0 comments

Bang bang he murdered math {the musical } (2024)

https://taylor.town/bang-bang
1•surprisetalk•3m ago•0 comments

A Night Without the Nerds – Claude Opus 4.6, Field-Tested

https://konfuzio.com/en/a-night-without-the-nerds-claude-opus-4-6-in-the-field-test/
1•konfuzio•6m ago•0 comments

Could ionospheric disturbances influence earthquakes?

https://www.kyoto-u.ac.jp/en/research-news/2026-02-06-0
1•geox•7m ago•0 comments

SpaceX's next astronaut launch for NASA is officially on for Feb. 11 as FAA clea

https://www.space.com/space-exploration/launches-spacecraft/spacexs-next-astronaut-launch-for-nas...
1•bookmtn•9m ago•0 comments

Show HN: One-click AI employee with its own cloud desktop

https://cloudbot-ai.com
1•fainir•11m ago•0 comments

Show HN: Poddley – Search podcasts by who's speaking

https://poddley.com
1•onesandofgrain•12m ago•0 comments

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•14m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
2•Brajeshwar•18m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
3•Brajeshwar•18m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
2•Brajeshwar•19m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•22m ago•1 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
2•righthand•25m ago•1 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•26m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•26m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
3•vinhnx•27m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•31m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•36m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•40m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•42m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•43m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
5•okaywriting•49m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
2•todsacerdoti•52m ago•0 comments
Open in hackernews

Can Cloudflare's AI pay per crawl succeed? I doubt it

https://developerwithacat.com/blog/202507/cloudflare-pay-per-crawl/
2•mmarian•6mo ago

Comments

nabla9•6mo ago
> Neither Cloudflare, nor any other service, will ever be able to block all scrapers. They can make their operations more expensive,

Cloudflare presents like single platform for crawlers. The get the same amount of data as platforms to bock crawlers they don't want. Other big platforms can prevent scrapers effectively when they don't want them Google, Facebook. etc. Nifty new scraper might crawl few million url's before it's detected.

mmarian•6mo ago
Hey! Sorry, didn't quite catch what you meant.

Is it that Cloudlare can always spot crawlers because of the amount of data they collect? Or is it there's always a nifty new scraper that will get away with it?

nabla9•6mo ago
It's that Cloudflare can always spot crawlers. Few million random urls crawled is nothing, and provides no value for AI companies, they want all.

Comprehensive crawl of LinkedIn, FB, instagram, IMDB, Amazon, would be worth a lot.

mmarian•6mo ago
> Cloudflare can always spot crawlers

I mention in the post a scraping service that Cloudflare isn't spotting: https://www.scrapingbee.com/blog/how-to-bypass-cloudflare-an...

Plenty of open-source ones as well that could bypass, eg maybe this one that came up in search https://github.com/VeNoMouS/cloudscraper Combine with residential proxies and you're just not going to find them.

> Comprehensive crawl of LinkedIn, FB, instagram, IMDB, Amazon, would be worth a lot.

Just from a quick Google search:

- LinkedIn: https://brightdata.com/products/datasets/linkedin

- Amazon: https://www.junglescout.com/features/product-database/

nabla9•6mo ago
As I said, partial scrapes of small subsets over long time provide no real value for AI scrapers.

Just an example: Brightdata linkedin database has 19 million entries. Linkedin has over 1 billion members.

As I said, partial scrapes of small subsets over long time provide no real value for AI scrapers (repeating the main argument).