Can Claude Read Your Website

https://johnbrennan.xyz/essay/can-claude-read-your-website
2•johnb95•2h ago

Comments

johnb95•2h ago
TL;DR: We ran a live experiment asking Claude Opus 4.6 to discover and read content across three websites built as React single-page applications with Express backends. At the start of the session, all three sites were effectively invisible: Claude received empty HTML shells with no article content, no navigation, and no discoverable paths to any content. Over several hours of iterative testing, debugging, and deployment, we identified which artifacts make a site legible to AI agents and which failures leave it dark.

The single most impactful change was a plain-text sitemap (`sitemap.txt`): one file, one URL per line, which turned a completely opaque site into one Claude could navigate autonomously. The experiment also showed that server-side HTML injection, structured Markdown endpoints, `llms.txt` directories, homepage discovery links, and correct MIME types each play a distinct, complementary role in AI legibility.

A final test of the Unified TOON Meta-Index (`utmi.toon`) showed that consolidating crawl rules, site index, AI summaries, and API tool registration into a single token-optimized file is viable and immediately useful to an AI agent, provided the file is served with a text MIME type rather than the default binary content type that web servers assign to unknown file extensions.
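To make the "one file, one URL per line" idea concrete, here is a minimal sketch in Python (not the Express stack the sites actually run on). The base URL and paths are made-up placeholders, and `build_sitemap_txt` and `parse_sitemap_txt` are hypothetical helper names, not anything from the experiment.

```python
from urllib.parse import urljoin


def build_sitemap_txt(base_url: str, paths: list[str]) -> str:
    """Return a sitemap.txt body: one absolute URL per line."""
    urls = [urljoin(base_url, path) for path in paths]
    return "\n".join(urls) + "\n"


def parse_sitemap_txt(body: str) -> list[str]:
    """Roughly what an agent without JavaScript can do: split on lines
    and keep every non-empty entry as a URL to fetch."""
    return [line.strip() for line in body.splitlines() if line.strip()]


# Placeholder site and paths, for illustration only.
sitemap = build_sitemap_txt(
    "https://example.com/",
    ["essay/first-post", "essay/second-post.md"],
)
urls = parse_sitemap_txt(sitemap)
print(urls)
```

The point of the format is that the parsing side needs nothing beyond splitting on newlines, which is why it works for a client that cannot render the page.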

Key Takeaways

React single-page applications are invisible to AI agents by default. Claude's fetch tools do not execute JavaScript, so any content rendered client-side does not exist from the agent's perspective.

A plain-text sitemap (sitemap.txt) was the single most impactful artifact. Once provided, Claude could autonomously discover and read every piece of content on a site.

Server-side HTML injection works, but edge caching can mask it entirely. A working injection pipeline appeared broken for over an hour because stale cached responses were being served.

Markdown endpoints (.md) are the ideal content format for AI agents. Structured front matter, clean hierarchy, and explicit metadata allow an LLM to parse, cite, and reason about content with zero friction.

Homepage discovery is the critical gap. If the homepage returns nothing navigable, an AI agent has no starting point, even if every other endpoint works perfectly.

MIME types for novel file formats must be explicitly configured. A .toon file served as application/octet-stream is unreadable binary to an AI agent, regardless of how well-designed the format is.

The UTMI format (utmi.toon) consolidates robots.txt, sitemaps, llms.txt, metadata, and API tool registration into a single file that Claude could parse immediately once the MIME type was corrected, demonstrating that unified site manifests are viable and useful for AI agents.
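The MIME type takeaway can be illustrated with a small sketch using Python's standard `mimetypes` module, standing in for whatever lookup table a web server uses. Mapping `.toon` to `text/plain` is an assumption made here for illustration; the post only says a text MIME type is needed. `content_type_for` is a hypothetical helper.

```python
import mimetypes

# A private registry seeded with Python's built-in extension table.
registry = mimetypes.MimeTypes()


def content_type_for(path: str) -> str:
    """Pick a Content-Type the way many static servers do: look up the
    extension, and fall back to a generic binary type when it is unknown."""
    guessed, _encoding = registry.guess_type(path)
    return guessed or "application/octet-stream"


# Before registration, an unrecognized extension gets the binary fallback.
before = content_type_for("utmi.toon")

# Assumption: text/plain is an acceptable text type for .toon files.
registry.add_type("text/plain", ".toon")
after = content_type_for("utmi.toon")

print(before, "->", after)
```

The same one-line registration exists in most server stacks; the detail that matters is that it has to be done explicitly for any extension the default table does not know.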
