HTML-to-Markdown converters produce clean, readable content for both humans and LLMs — but the DOM structure is lost along the way. You can always feed Markdown to an LLM to extract structured information, but that costs tokens on every page, every time.

What if the LLM could also see where each piece of content lives in the DOM? It could then generate robust scraping code: stable selectors and XPaths that run without any LLM in the loop, saving tokens and improving accuracy on long or repetitive pages.

Scrapedown does exactly this: it converts HTML to Markdown and annotates each element with its CSS selector and/or XPath, so an LLM can produce precise, reusable scraper code in one shot.
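To make the "no LLM in the loop" part concrete, here is a minimal sketch of the kind of scraper an LLM might hand back once it knows where content sits in the DOM. It does not use Scrapedown or show its annotation format; the page fragment, the paths, and the `scrape` function are all hypothetical, and it sticks to Python's standard library, which only supports a small XPath subset and needs well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment, kept well-formed so the stdlib XML parser can read it.
PAGE = """
<html><body>
  <div id="products">
    <div class="item"><h2>Blue Mug</h2><span class="price">12.50</span></div>
    <div class="item"><h2>Red Bowl</h2><span class="price">8.00</span></div>
  </div>
</body></html>
"""

# Hypothetical paths of the kind an LLM might choose after seeing DOM locations.
# These are assumptions for illustration, not actual Scrapedown output.
ITEM_PATH = ".//div[@id='products']/div[@class='item']"
FIELD_PATHS = {"name": "h2", "price": "span[@class='price']"}


def scrape(page: str) -> list[dict[str, str]]:
    """Extract one record per item using fixed paths, with no LLM call."""
    root = ET.fromstring(page.strip())
    records = []
    for item in root.findall(ITEM_PATH):
        record = {}
        for field, path in FIELD_PATHS.items():
            node = item.find(path)
            # A missing node yields an empty string instead of raising.
            record[field] = (node.text or "").strip() if node is not None else ""
        records.append(record)
    return records


if __name__ == "__main__":
    for row in scrape(PAGE):
        print(row)
```

Once the paths are fixed, the same function can run over every page that shares the layout, so the token cost is paid once when the scraper is generated rather than on every page.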

Show HN: HTML to Markdown with CSS selector & XPath annotations for LLM Scraper