How it works:
1. Input: URL / PDF upload / photo of printed text (OCR)
2. AI cleanup: strips ads, fixes formatting artifacts, optimizes for speech
3. TTS: Google Cloud Chirp HD voices (6 voices, adjustable speed)
No signup. No account. 3 free articles per day. Just paste and listen.
Tech stack: Next.js 14, Google Cloud TTS (Chirp 3 HD + Standard), Google Cloud Vision (OCR), Claude Haiku (content cleanup), Upstash Redis (rate limiting), Vercel.
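For anyone wondering how a "3 free articles per day" limit can work without accounts, here is a rough sketch, not the production code. It assumes some per-client identifier (for example a hashed IP, which is my assumption here) and a counter keyed by client and UTC date. An in-memory Map stands in for Redis so the example runs on its own; the names `allowFreeArticle` and `dailyKey` are hypothetical.

```typescript
const FREE_ARTICLES_PER_DAY = 3;

// Stand-in for Redis: maps "clientId:YYYY-MM-DD" to articles used that day.
const usage = new Map<string, number>();

// Including the UTC date in the key makes the counter reset every day.
function dailyKey(clientId: string, now: Date): string {
  return `${clientId}:${now.toISOString().slice(0, 10)}`;
}

// Returns true and counts the article if the client still has free quota today.
function allowFreeArticle(clientId: string, now: Date = new Date()): boolean {
  const key = dailyKey(clientId, now);
  const used = usage.get(key) ?? 0;
  if (used >= FREE_ARTICLES_PER_DAY) {
    return false;
  }
  usage.set(key, used + 1);
  return true;
}
```

With a real Redis backend the same idea would typically be an atomic increment on the daily key plus an expiry, so old keys clean themselves up.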
Some decisions I'm happy with:
- Google TTS primary, OpenAI TTS fallback. Switching to Google TTS cut my costs from $77 over 4 days to ~$2/day.
- Mozilla Readability for extraction, Firecrawl as fallback, Claude Haiku to strip "Subscribe to read more" junk before TTS.
- Purchase codes instead of accounts: buy a credit pack, get a portable code that works across devices. No auth needed.
- PDF extraction uses pdf-parse → Claude cleanup. Photo uses Cloud Vision documentTextDetection → Claude cleanup. Both feed into the same TTS pipeline.
- $50/day spending cap in code so I don't wake up to a surprise bill.
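The spending cap from the last bullet can be sketched roughly like this. It is a minimal illustration under stated assumptions, not the actual implementation: the `SpendGuard` class, the UTC day rollover, and the per-million-character price parameter are all hypothetical, and spend is tracked in memory rather than in a shared store.

```typescript
const DAILY_CAP_USD = 50;

class SpendGuard {
  private capUsd: number;
  private day = "";
  private spentUsd = 0;

  constructor(capUsd: number = DAILY_CAP_USD) {
    this.capUsd = capUsd;
  }

  // Records the cost and returns true if it fits under today's cap;
  // returns false (and records nothing) if it would exceed the cap.
  tryCharge(costUsd: number, now: Date = new Date()): boolean {
    const today = now.toISOString().slice(0, 10); // UTC date, e.g. "2024-01-01"
    if (today !== this.day) {
      // New day: reset the running total.
      this.day = today;
      this.spentUsd = 0;
    }
    if (this.spentUsd + costUsd > this.capUsd) {
      return false;
    }
    this.spentUsd += costUsd;
    return true;
  }
}

// Estimate TTS cost from text length. The price is a parameter because
// it is an assumption here, not a quoted provider rate.
function estimateCostUsd(text: string, usdPerMillionChars: number): number {
  return (text.length / 1_000_000) * usdPerMillionChars;
}
```

The idea is to call `tryCharge` with the estimated cost before each TTS request and refuse the request when it returns false, so the cap is enforced before money is spent rather than noticed afterwards.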
What I'm still figuring out:
(1) Chrome extension (the real distribution play; it's how Speechify grew)
(2) Whether to move from credit packs to subscriptions. Users don't really like subs unless it's a very famous website.
(3) Getting scanned/image-based PDFs working (currently only text-based PDFs work)
Would love feedback on voice quality, extraction accuracy, and whether the no-signup approach makes sense long-term.
Try it: https://sornic.com
Telegram bot: https://t.me/SornicBot