I kept solving this problem from scratch on different projects, so I packaged it up as Reader, hoping it saves others the same headaches...
Two primitives:
const reader = new ReaderClient();
// Scrape URLs → clean markdown
const result = await reader.scrape({ urls: ["https://example.com"] });
// Crawl a site → discover + scrape pages
const pages = await reader.crawl({ url: "https://example.com", depth: 2 });
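To give a sense of end-to-end usage, here's roughly how I wire the crawl output into files, continuing from the snippet above. The per-page fields shown (url, markdown) are illustrative rather than a spec:

import { writeFile } from "node:fs/promises";

// Persist each crawled page as a markdown file (field names are illustrative)
for (const page of pages) {
  const slug = new URL(page.url).pathname.replaceAll("/", "_").replace(/^_+|_+$/g, "") || "index";
  await writeFile(`${slug}.md`, page.markdown);
}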
Under the hood it's built on Ulixee Hero, a headless browser designed for anti-detection. The hard stuff (TLS fingerprinting, Cloudflare/Turnstile bypass, browser pool recycling, proxy rotation) is built in. The HTML-to-markdown conversion uses supermarkdown, a Rust engine I built specifically for messy real-world HTML: clean output, no artifacts.
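For anyone curious what Reader is abstracting away: this is roughly the bare Ulixee Hero ceremony you'd otherwise write per page (standard Hero quickstart API, a sketch without any of the pooling, rotation, or markdown conversion layered on top):

import Hero from "@ulixee/hero-playground";

// One-off Hero session: single page, no pool, no proxy rotation, raw HTML out
const hero = new Hero();
await hero.goto("https://example.com");
await hero.waitForPaintingStable();
const html = await hero.document.documentElement.outerHTML;
await hero.close();
// Reader manages these sessions in a recycled pool and converts the HTML to markdown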
TypeScript-first with full type safety, usable as a CLI or as a library. Apache 2.0 licensed.
GitHub: https://github.com/vakra-dev/reader
Happy to answer questions about the architecture, approach, or tradeoffs I made.
Would love feedback from anyone doing web scraping at scale, especially on edge cases where it breaks. That's how I can make this better.