frontpage.

Ultrathin business card runs a fluid simulation

https://github.com/Nicholas-L-Johnson/flip-card
105•wompapumpum•55m ago•31 comments

GPT-5

https://openai.com/gpt-5/
1835•rd•19h ago•2166 comments

Linear sent me down a local-first rabbit hole

https://bytemash.net/posts/i-went-down-the-linear-rabbit-hole/
219•jcusch•6h ago•70 comments

Window Activation

https://blog.broulik.de/2025/08/on-window-activation/
22•LorenDB•4d ago•2 comments

Amtrak NextGen Acela Debuts on August 28

https://media.amtrak.com/2025/08/amtrak-nextgen-acela-debuts-on-august-28/
16•reimbar•1h ago•19 comments

Flipper Zero dark web firmware bypasses rolling code security

https://www.rtl-sdr.com/flipperzero-darkweb-firmware-bypasses-rolling-code-security/
367•lq9AJ8yrfs•15h ago•212 comments

Historical Tech Tree

https://www.historicaltechtree.com/
426•louisfd94•17h ago•96 comments

How we enforce .NET coding standards to improve productivity

https://anthonysimmon.com/workleap-dotnet-coding-standards/
45•fratellobigio•3d ago•16 comments

Cursor CLI

https://cursor.com/cli
307•gonzalovargas•15h ago•208 comments

OpenAI's new open-source model is basically Phi-5

https://www.seangoedecke.com/gpt-oss-is-phi-5/
326•emschwartz•17h ago•167 comments

GPT-5: Key characteristics, pricing and system card

https://simonwillison.net/2025/Aug/7/gpt-5/
563•Philpax•18h ago•240 comments

How Attention Sinks Keep Language Models Stable

https://hanlab.mit.edu/blog/streamingllm
24•pr337h4m•3h ago•3 comments

A love letter to my future employer (2020)

https://catzkorn.dev/blog/love-letter/
32•luu•6h ago•3 comments

Virtual Linux Devices on ARM64

https://underjord.io/500-virtual-linux-devices-on-arm64.html
25•lawik•4d ago•2 comments

What Is Popover=Hint?

https://una.im/popover-hint/
20•speckx•3d ago•3 comments

Exit Tax: Leave Germany before your business gets big

https://eidel.io/exit-tax-leave-germany-before-your-business-gets-big/
261•olieidel•18h ago•324 comments

GPT-5 for Developers

https://openai.com/index/introducing-gpt-5-for-developers
428•6thbit•19h ago•233 comments

FLUX.1-Krea and the Rise of Opinionated Models

https://www.dbreunig.com/2025/08/04/the-rise-of-opinionated-models.html
23•dbreunig•3d ago•5 comments

Turn any website into an API

https://www.parse.bot
34•pcl•7h ago•6 comments

Writing a storage engine for Postgres: An in-memory table access method (2023)

https://notes.eatonphil.com/2023-11-01-postgres-table-access-methods.html
75•ibobev•4d ago•9 comments

Encryption made for police and military radios may be easily cracked

https://www.wired.com/story/encryption-made-for-police-and-military-radios-may-be-easily-cracked-researchers-find/
194•mikece•18h ago•117 comments

Cursed Knowledge

https://immich.app/cursed-knowledge/
371•bqmjjx0kac•13h ago•108 comments

Achieving 10,000x training data reduction with high-fidelity labels

https://research.google/blog/achieving-10000x-training-data-reduction-with-high-fidelity-labels/
126•badmonster•15h ago•21 comments

Building Bluesky comments for my blog

https://natalie.sh/posts/bluesky-comments/
330•g0xA52A2A•20h ago•122 comments

How AI conquered the US economy: A visual FAQ

https://www.derekthompson.org/p/how-ai-conquered-the-us-economy-a
245•rbanffy•1d ago•198 comments

Windows XP Professional

https://win32.run/
385•pentagrama•22h ago•213 comments

Claude Code IDE integration for Emacs

https://github.com/manzaltu/claude-code-ide.el
762•kgwgk•1d ago•255 comments

Benchmark Framework Desktop Mainboard and 4-node cluster

https://github.com/geerlingguy/ollama-benchmark/issues/21
177•geerlingguy•18h ago•52 comments

I don't read your email threads

https://loganmarek.com/i-dont-read-your-threads/
22•xvok•1h ago•23 comments

Infinite Pixels

https://meyerweb.com/eric/thoughts/2025/08/07/infinite-pixels/
239•OuterVale•23h ago•55 comments

Turn any website into an API

https://www.parse.bot
34•pcl•7h ago

Comments

runningmike•6h ago
Nice idea. In practice, many sites use different methods to prevent scraping. There is a large risk in doing things manually, imho.
renegat0x0•4h ago
Huh, I have been working on a solution to that problem.

My project lets you define rules for various sites, so eventually everything is scraped correctly. For YouTube, yt-dlp is also used to augment results.

It can crawl using requests, Selenium, httpx, and others. The response is JSON, so it is easy to process.

The downside is that it may not be the fastest solution, and I have not tested it against proxies.

https://github.com/rumca-js/crawler-buddy
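
To illustrate the per-site rules idea, here is a minimal sketch. It is not crawler-buddy's actual code or API: the rule table, the field names (`backend`, `augment_with`), and the functions `rule_for` and `plan_crawl` are all hypothetical, and nothing here performs a network request. It only shows how a URL could be matched to a rule and described as JSON.

```python
import json
from urllib.parse import urlparse

# Hypothetical rule table: domain -> crawl settings.
# The backend names are labels only; this sketch never fetches anything.
SITE_RULES = {
    "youtube.com": {"backend": "requests", "augment_with": "yt-dlp"},
    "example.com": {"backend": "selenium", "augment_with": None},
}
DEFAULT_RULE = {"backend": "requests", "augment_with": None}


def rule_for(url: str) -> dict:
    """Pick the rule whose domain matches the URL's host, else the default."""
    host = (urlparse(url).hostname or "").lower()
    for domain, rule in SITE_RULES.items():
        # Match the domain itself or any subdomain, but not lookalikes
        # such as "notyoutube.com".
        if host == domain or host.endswith("." + domain):
            return rule
    return DEFAULT_RULE


def plan_crawl(url: str) -> str:
    """Return a JSON document describing how the URL would be crawled."""
    return json.dumps({"url": url, **rule_for(url)}, sort_keys=True)
```

Calling `plan_crawl("https://www.youtube.com/watch?v=abc")` returns a JSON string that a client can parse with `json.loads`.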

with•5h ago
pretty cool idea. using stagehand under the hood?
vin047•4h ago
No information on pricing on the site.
thrdbndndn•3h ago
I scrape website content regularly (usually as one-offs) and have a hand-crafted extractor template where I just fill in a few arguments (mainly CSS selectors and some options) to get it working quickly. These days, I do sometimes ask AI to do this for me by giving it the HTML.

The issue is that for any serious use of this concept, some manual adjustment is almost always needed. This service says, "Refine your scraper at any time by chatting with the AI agent," but from what I can tell, you can't actually see the code it generates.

Relying solely on the results and asking the AI to tweak them can work, but often the output is too tailored to a specific page and fails to generalize (essentially "overfitting"). And surprisingly, this back-and-forth can be more tedious and time-consuming than just editing a few lines of code yourself. Also, if you can't directly edit the code behind the scenes, there are situations where you'll never be able to get the exact result you want, no matter how much you try to explain it to the AI in natural language.
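
For illustration, a fill-in-the-arguments extractor template like the one described above might look roughly like this. It is a sketch, not the commenter's actual template: it uses only the Python standard library and supports just a tiny selector subset (`tag`, `.class`, `#id`, `tag.class`), and the names `parse_selector`, `FieldExtractor`, and `extract` are made up for this example. A real template would more likely lean on a library with full CSS selector support.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


def parse_selector(selector):
    """Split 'tag', '.cls', '#id', or 'tag.cls' into (tag, cls, id)."""
    tag, _, elem_id = selector.partition("#")
    tag, _, cls = tag.partition(".")
    return tag or None, cls or None, elem_id or None


def matches(parts, tag, attrs):
    want_tag, want_cls, want_id = parts
    classes = (attrs.get("class") or "").split()
    return ((want_tag is None or want_tag == tag)
            and (want_cls is None or want_cls in classes)
            and (want_id is None or want_id == attrs.get("id")))


class FieldExtractor(HTMLParser):
    """Collect the text of every element matching each field's selector."""

    def __init__(self, fields):
        super().__init__()
        self.fields = {name: parse_selector(s) for name, s in fields.items()}
        self.results = {name: [] for name in fields}
        self.stack = []  # open elements as (tag, captures)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements carry no text and never get closed
        attrs = dict(attrs)
        captures = []
        for name, parts in self.fields.items():
            if matches(parts, tag, attrs):
                # Reserve a slot now so results stay in document order.
                self.results[name].append("")
                captures.append((name, len(self.results[name]) - 1, []))
        self.stack.append((tag, captures))

    def handle_data(self, data):
        # Text belongs to every open element that is being captured.
        for _, captures in self.stack:
            for _, _, chunks in captures:
                chunks.append(data)

    def handle_endtag(self, tag):
        if not any(open_tag == tag for open_tag, _ in self.stack):
            return  # stray closing tag
        while self.stack:
            open_tag, captures = self.stack.pop()
            self._finish(captures)
            if open_tag == tag:
                break

    def _finish(self, captures):
        for name, index, chunks in captures:
            self.results[name][index] = "".join(chunks).strip()

    def flush_open(self):
        """Finalize elements that were never closed in the input."""
        while self.stack:
            self._finish(self.stack.pop()[1])


def extract(html, fields):
    """fields maps an output name to a simple selector; returns lists of text."""
    parser = FieldExtractor(fields)
    parser.feed(html)
    parser.close()
    parser.flush_open()
    return parser.results
```

Adapting it to a new page then means changing only the `fields` argument, for example `extract(page_html, {"title": "#title", "paragraphs": "p.body"})`, which is the kind of small manual edit that is hard to make when the generated code is hidden.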

websiteapi•17m ago
I'm surprised (and could be wrong) that no one has made a Chrome extension that just controls a page and exposes the output to localhost for consumption as an API. Similar to using Chrome WebDriver, but without the setup.