I’m Mo. I’m currently building a startup and wanted a way to listen to research papers for inspiration while commuting or washing dishes.
I tried using Google’s NotebookLM, but the output didn't stick for me. It felt a bit robotic, the conversations were too short, and it didn't go deep enough into the technical details.
So I built PaperBot FM to fix that for myself.
It takes a research paper (plus up to 2 supporting papers for context) and synthesizes a podcast episode. Right now, the episodes are averaging around 30 minutes, though I'm still tweaking the length.
The Tech: The main challenge was the audio. I couldn't find a TTS service that handled 3 distinct voices effectively in a single conversation flow. To solve this, I built a custom wrapper around Gemini TTS that orchestrates 3 separate "personas" to keep the dynamic interesting.
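Roughly, the orchestration idea can be sketched like this. It is illustrative only, not the actual PaperBot FM code: the persona names, the voice IDs in `VOICES`, the "speaker: text" script format, and the `synthesize` callable are all assumptions. `synthesize` stands in for whatever TTS call is used (Gemini TTS in this case) and is expected to return raw audio bytes for one turn.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Turn:
    speaker: str
    text: str


# Hypothetical persona-to-voice mapping; these names are placeholders,
# not real voice IDs from any TTS service.
VOICES: Dict[str, str] = {
    "host": "voice_a",
    "expert": "voice_b",
    "skeptic": "voice_c",
}


def parse_script(script: str) -> List[Turn]:
    """Split 'speaker: text' lines into turns, skipping blank lines."""
    turns = []
    for line in script.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so colons inside the text survive.
        speaker, _, text = line.partition(":")
        turns.append(Turn(speaker.strip().lower(), text.strip()))
    return turns


def render_episode(
    turns: List[Turn],
    synthesize: Callable[[str, str], bytes],
    gap: bytes = b"",
) -> bytes:
    """Synthesize each turn with its persona's voice and join the audio."""
    chunks = []
    for turn in turns:
        voice = VOICES[turn.speaker]  # raises KeyError for an unknown persona
        chunks.append(synthesize(turn.text, voice))
    # `gap` can hold a short stretch of silence between turns.
    return gap.join(chunks)


def fake_tts(text: str, voice: str) -> bytes:
    """Stand-in for a real TTS call, so the sketch runs offline."""
    return f"[{voice}:{text}]".encode()
```

Calling `render_episode(parse_script(script), fake_tts)` walks the script turn by turn. Swapping `fake_tts` for a real TTS client is where the single-voice-per-call limitation gets worked around: each persona is synthesized separately and the pieces are stitched together afterwards.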
How it runs: Currently, the site is just a daily community feed. Users submit papers, vote, and the system generates one episode every 24h based on the winner. It's completely free and all episodes are public.
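The daily pick can be sketched in a few lines. This is a guess at the shape of it, not the site's real logic: the `Submission` fields and the tie-break rule (earliest submission wins when votes are equal) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Submission:
    paper_url: str
    votes: int
    submitted_at: datetime


def pick_winner(submissions: List[Submission]) -> Optional[Submission]:
    """Return the most-voted submission; ties go to the earliest one."""
    if not submissions:
        return None  # nothing submitted in this 24h window
    # Sort key: highest votes first, then oldest submission (assumed rule).
    return min(submissions, key=lambda s: (-s.votes, s.submitted_at))
```

A scheduled job would call `pick_winner` once per 24h window and hand the winning paper to the episode generator.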
What's next?
I'm gauging interest on two things:
- Turning this into a service where you can generate episodes on demand (for explainers, internal docs, etc.).
- Opening up the voice orchestration as an API, since finding a service that supports more than 2 concurrent voices was surprisingly hard.
Hope you like it! And please let me know if you’d be interested in generating custom episodes or in the voice generation API.