
Create Missing RSS Feeds with LLMs

https://taras.glek.net/posts/create-missing-rss-feeds-with-llms/
2•alastairr•9mo ago

Comments

PaulHoule•9mo ago
The general story about the LLM-scraper problem is that (1) "companies like OpenAI run badly implemented web crawlers to get training data", but there is also (2) with LLMs, scrapers could do content understanding (inference) that would make them more useful, and, what I think is even more impactful, (3) LLMs will empower people to write scrapers who would never have written them before.

I kinda laugh at (3) because it's been a running gag for me that management vastly overestimates the effort to write scrapers and crawlers because they've been burned by vastly underestimating the effort to develop what look like simple UI applications.

They usually think "this will be a hassle to maintain" but it usually isn't because: (a) the target web sites rarely change in a significant way because UI development is such a hassle and (b) the target web sites rarely change in a significant way because Google will punish them if they do [1].

It takes about 10 minutes to write a scraper if you do it all the time and have an API like BeautifulSoup at your fingertips, and probably 20 minutes to vibe code it if you don't.

I am still using the same HTML scraper to process image galleries today that I used to process Flickr galleries back in the '00s. For a while the pattern was "fight with the OAuth to log into an API for 45 minutes" or "spend weeks figuring out how to parse MediaWiki markup" and then "get the old scraper working in less than 15 minutes". Frequently the scraper works perfectly out of the box, sometimes it works 80% out of the box, and it always works 100% after adding a handful of rules.
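For illustration only, here is a rough sketch of what a small gallery scraper with "a handful of rules" might look like. It is not the scraper described above: it uses Python's standard-library `html.parser` instead of BeautifulSoup so it runs without extra packages, and the names `GalleryParser`, `scrape_gallery` and `skip_substrings` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class GalleryParser(HTMLParser):
    """Collect absolute image URLs from <img> tags and from <a> links to image files."""

    IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif", ".webp")

    def __init__(self, base_url, skip_substrings=()):
        super().__init__()
        self.base_url = base_url
        # The "handful of rules": URL fragments to ignore (thumbnails, logos, ...).
        self.skip_substrings = tuple(skip_substrings)
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            candidate = attrs.get("src")
        elif tag == "a":
            candidate = attrs.get("href")
            # Only keep links that point straight at an image file.
            if candidate and not candidate.lower().split("?")[0].endswith(self.IMAGE_SUFFIXES):
                candidate = None
        else:
            candidate = None
        if not candidate:
            return
        url = urljoin(self.base_url, candidate)  # resolve relative paths
        if any(fragment in url for fragment in self.skip_substrings):
            return
        if url not in self.images:  # de-duplicate, keep page order
            self.images.append(url)


def scrape_gallery(html, base_url, skip_substrings=()):
    """Return the image URLs found in `html`, in page order."""
    parser = GalleryParser(base_url, skip_substrings)
    parser.feed(html)
    return parser.images
```

Adapting this to a new site would mostly mean changing the `skip_substrings` passed in, for example `scrape_gallery(page_html, "https://example.com/g/", ["/thumb/", "/static/"])`.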

I work on a product that has a React-based site and it seems the "state of the art" in scraping a URL [2] like

   https://example.com/item/8788481
is to download the HTML and then all the JavaScript and CSS and other stuff with no cache (for every freaking page), run the JavaScript, and have something scrape the content out of the DOM, whereas they could just go to

   https://example.com/api/item/8788481
and get the data they want in a JSON format which could be processed like item["metadata"]["title"] or just stuffed into a JSONB column and queried any way you like. Login is not "fight with OAuth" but something like "POST username and password to https://example.com/api/login with a client that has a cookie jar". I don't really think "most people are stupid" that often but I think it all the time when web scraping is involved.
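A minimal sketch of that approach, using only the Python standard library. The host and paths are the placeholder ones from this comment. The form-encoded login body, the `username` and `password` field names, and the helper names (`make_client`, `login`, `api_url`, `fetch_item`, `title_of`) are assumptions made for the example, not a description of any real site's API.

```python
import json
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "https://example.com"  # placeholder host from the comment


def make_client():
    """Return an opener that stores and resends cookies (the "cookie jar")."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login(client, username, password):
    """POST credentials once; the session cookie then lives in the client's jar."""
    # Assumption: the endpoint takes a form-encoded body. A real site may expect JSON.
    body = urllib.parse.urlencode({"username": username, "password": password}).encode()
    with client.open(f"{BASE_URL}/api/login", data=body, timeout=30) as resp:
        return resp.status == 200


def api_url(page_url):
    """Map a page URL such as /item/<id> to its JSON counterpart /api/item/<id>."""
    parts = urllib.parse.urlsplit(page_url)
    return urllib.parse.urlunsplit(parts._replace(path="/api" + parts.path))


def fetch_item(client, page_url):
    """Fetch the JSON for one item: no HTML, no JavaScript, no DOM."""
    with client.open(api_url(page_url), timeout=30) as resp:
        return json.load(resp)


def title_of(item):
    """Pull one field out of the decoded JSON."""
    return item["metadata"]["title"]
```

Usage would look roughly like `client = make_client()`, then `login(client, user, password)`, then `title_of(fetch_item(client, "https://example.com/item/8788481"))`. The whole decoded dict could equally be serialized with `json.dumps` and stored in a JSONB column.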

[1] They even have a patent for it! People who run online ad campaigns A/B test anything, but the last thing Google wants is for an SEO to be able to settle questions like "will my site rank higher if I put a certain phrase in a <b>?"

[2] ... as in, we see people doing it in our logs