I built this to solve a problem I had when building AI agents: raw HTML spends 60-95% of an agent's tokens on markup instead of content, and Cloudflare's new
"Markdown for Agents" only works on ~5% of the web (opt-in only).
THE PROBLEM:
I tested 100 popular websites by sending the Accept: text/markdown header that Cloudflare's feature uses. Only 3 actually served markdown; the rest
still returned HTML. It turns out the feature requires website owners to opt in, which most won't do for years (if ever).
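For anyone who wants to reproduce this kind of check, here is a minimal sketch (not the actual test script). The function names are hypothetical, and it assumes a site that supports the feature answers with a Content-Type of text/markdown:

```typescript
// Hypothetical helper: does a Content-Type header value indicate markdown?
function isMarkdownContentType(contentType: string | null): boolean {
  if (!contentType) return false;
  // Drop parameters such as "; charset=utf-8" and compare the media type only.
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return mediaType === "text/markdown";
}

// Request a URL with Accept: text/markdown and report whether markdown came back.
// Assumption: a supporting site labels the response as text/markdown.
async function servesMarkdown(url: string): Promise<boolean> {
  const response = await fetch(url, {
    headers: { Accept: "text/markdown" },
    redirect: "follow",
  });
  return isMarkdownContentType(response.headers.get("content-type"));
}
```

Running servesMarkdown over a list of URLs and counting the true results gives the kind of tally described above.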
MY SOLUTION:
Klovr converts any webpage to markdown on-demand:
- Same Accept headers as Cloudflare (100% compatible)
- Works on 100% of sites (no opt-in needed)
- Redis caching with 7-day TTL (10-100x speedup on repeated URLs)
- Playwright for dynamic content (better anti-detection than Puppeteer)
- Content-Signal headers for AI compliance
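To illustrate the Accept-header compatibility, here is a rough sketch of how a server could decide between markdown and HTML. This is an assumption about one reasonable negotiation rule, not Klovr's or Cloudflare's actual logic, and parseAccept and chooseRepresentation are hypothetical names:

```typescript
type Representation = "markdown" | "html";

// Parse an Accept header into media types with their q-values (default q = 1).
function parseAccept(header: string): Array<{ type: string; q: number }> {
  return header
    .split(",")
    .map((part) => {
      const [type = "", ...params] = part.trim().split(";");
      let q = 1;
      for (const param of params) {
        const [name = "", value = ""] = param.trim().split("=");
        if (name.trim().toLowerCase() === "q") {
          const parsed = Number.parseFloat(value);
          if (!Number.isNaN(parsed)) q = parsed;
        }
      }
      return { type: type.trim().toLowerCase(), q };
    })
    .filter((entry) => entry.type !== "");
}

// Assumed rule: serve markdown only when the client lists text/markdown
// explicitly and prefers it at least as much as text/html.
function chooseRepresentation(acceptHeader: string | null): Representation {
  if (!acceptHeader) return "html";
  const entries = parseAccept(acceptHeader);
  const markdownQ = entries.find((e) => e.type === "text/markdown")?.q ?? 0;
  const htmlQ = entries.find((e) => e.type === "text/html")?.q ?? 0;
  return markdownQ > 0 && markdownQ >= htmlQ ? "markdown" : "html";
}
```

With this rule, a browser's default Accept header still gets HTML, while an agent sending Accept: text/markdown gets markdown.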
TECH STACK:
- Next.js 15 (App Router) + Vercel
- Playwright for browser automation
- Redis (via ioredis) for caching
- Drizzle ORM + Neon PostgreSQL
- Readability.js + Turndown for conversion
FREE TIER: 10,000 conversions/month (no credit card)
WHAT I LEARNED:
1. Puppeteer-extra doesn't work on Vercel (ESM/CommonJS conflicts)
2. Playwright has better anti-detection out of the box
3. Redis caching is critical - a first request takes about 2000ms, a cached one about 50ms (roughly 40x faster)
4. Most sites still don't support Cloudflare's markdown (hence the need for universal conversion)
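The caching point (item 3) follows the usual cache-aside pattern. Below is a minimal sketch, not the production caching layer: the KeyValueStore interface, the md: key prefix, and the function names are assumptions for illustration. The interface is shaped so that an ioredis client's get and set(key, value, "EX", seconds) calls should fit it.

```typescript
// Minimal store interface (assumed); modeled on ioredis's
// `get(key)` and `set(key, value, "EX", seconds)` calls.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: "EX", seconds: number): Promise<unknown>;
}

const SEVEN_DAYS_IN_SECONDS = 7 * 24 * 60 * 60;

// Hypothetical key scheme: drop the fragment so "#section" variants share one entry.
function cacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.hash = "";
  return `md:${url.toString()}`;
}

// Cache-aside: return the cached markdown if present, otherwise run the
// (slow) conversion once and store the result with a 7-day expiry.
async function getMarkdown(
  store: KeyValueStore,
  rawUrl: string,
  convert: (url: string) => Promise<string>,
): Promise<{ markdown: string; cached: boolean }> {
  const key = cacheKey(rawUrl);
  const hit = await store.get(key);
  if (hit !== null) return { markdown: hit, cached: true };

  const markdown = await convert(rawUrl);
  await store.set(key, markdown, "EX", SEVEN_DAYS_IN_SECONDS);
  return { markdown, cached: false };
}
```

Passing the converter in as a function keeps the cache logic testable with an in-memory store and makes it easy to swap the fetch-and-convert step later.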
CURRENT LIMITATIONS:
- Payment processing is in development (everyone on free tier for now)
- Dynamic content (Playwright) temporarily disabled for launch (re-enabling next week)
- IP-based blocking (Reddit, LinkedIn) still happens - these sites block requests from datacenter IPs, and I don't have a way around that
I'd love feedback on:
- Architecture choices (should I use a different caching strategy?)
- The positioning (am I framing the Cloudflare comparison correctly?)
- What features would make this more useful for your AI agents?
GitHub isn't public yet, but happy to share code snippets for specific parts (stealth script, caching layer, etc.).
vaibhavlodha98•1h ago