Every TTS tool I tried broke on complex formatting. Papers with math, citations, figure references, page numbers in the middle of sentences. You either get garbled output or you're listening to raw LaTeX.
Yapit converts everything to markdown as a common format. For web pages, defuddle (https://github.com/kepano/defuddle) handles extraction, stripping clutter and presenting the main article content in a clean, consistent format. For PDFs, a vision LLM rewrites each page into markdown with annotation tags that separate what you see from what gets read aloud. Math is rendered visually but gets spoken alt text. Citations like "[13]" or "(Schmidhuber, 1970)" are displayed but never spoken. Page numbers and running headers are removed entirely.
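To make the display/spoken split concrete, here is a minimal sketch of how such annotation tags could work. The `{display|spoken}` pair syntax is purely hypothetical for illustration; Yapit's actual tag format may differ:

```python
import re

# Hypothetical annotation syntax (NOT Yapit's real format):
# {display|spoken} pairs, e.g. {E=mc^2|E equals m c squared}.
# An empty spoken half means "show but stay silent", as with citations.
PAIR = re.compile(r"\{([^|{}]*)\|([^|{}]*)\}")

def display_text(md: str) -> str:
    """Keep the visual half of every annotation pair."""
    return PAIR.sub(lambda m: m.group(1), md)

def spoken_text(md: str) -> str:
    """Keep the spoken half, which the TTS engine would receive."""
    return PAIR.sub(lambda m: m.group(2), md)

s = "Energy is {E=mc^2|E equals m c squared}{ [13]|}."
print(display_text(s))  # Energy is E=mc^2 [13].
print(spoken_text(s))   # Energy is E equals m c squared.
```

The point is that one markdown source carries both renderings, so the reader view and the audio track never drift apart.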
Both extraction and audio are cached by content hash, so the same content is never processed or synthesized twice.
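The caching idea can be sketched in a few lines. This is an illustrative toy, not Yapit's implementation; the hash function and cache store are assumptions:

```python
import hashlib

# Toy content-hash cache: identical text always maps to the same key,
# so extraction or synthesis runs at most once per unique content.
_cache: dict[str, bytes] = {}

def content_key(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def synthesize_cached(text: str, synthesize) -> bytes:
    key = content_key(text)
    if key not in _cache:
        _cache[key] = synthesize(text)  # only on cache miss
    return _cache[key]

calls = []
def fake_tts(text: str) -> bytes:
    calls.append(text)
    return b"audio:" + text.encode("utf-8")

a = synthesize_cached("hello world", fake_tts)
b = synthesize_cached("hello world", fake_tts)
assert a == b and len(calls) == 1  # second request served from cache
```

Because the key is derived from the content itself rather than the URL or filename, two different documents quoting the same paragraph share one cached result.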
Self-hosting works with any OpenAI-compatible TTS server (vLLM-Omni, ...) and any OpenAI-compatible vision model for PDF extraction:
git clone --depth 1 https://github.com/yapit-tts/yapit.git && cd yapit
cp .env.selfhost.example .env.selfhost
make self-host
Kokoro TTS also runs in the browser via WebGPU on desktop. Try it on Attention Is All You Need (all voices cached, no account needed): https://yapit.md/listen/3bde213b-3a5a-465f-9198-be65430b699e...
Or paste any URL: https://yapit.md/https://arxiv.org/abs/1810.04805 https://yapit.md/https://x.com/karpathy/status/2039805659525...
GitHub: https://github.com/yapit-tts/yapit (AGPL-3)