Show HN: Free audiobooks with synchronized text for language learning

9•floo•5h ago

Comments

floo•5h ago

hey HN! this is my attempt at language learning with audiobooks. it synchronizes text to speech, and shows translations. the audiobooks themselves are all public domain.

got all of the audio alignment, translation, and asset generation working on my gaming computer. pretty happy with the pipeline, except for the sometimes subpar translations.

if anyone is interested in the details I am happy to write them up!

if you are into language learning, I would love to hear if this could be useful to you!

aanet•3h ago

This is fantastic!

I've been meaning to learn Spanish, and this looks super useful.

Would love to learn more about your pipeline [selfishly, I was looking to build (free) ebooks -> audio for my own purposes as a side project]

What were the most challenging aspects? What assumptions failed / held true? Any experiences to share? Thx

floo•3h ago

glad to hear it!

went through quite a few iterations of aligning text to speech. found that ai transcription was really good most of the time but would hallucinate quite a bit towards the start and end of books. I think that might be related to those models being partially trained on audiobooks while only having the book text itself, without any of the intro or credits.

in the end I landed on extracting text from ebooks, using rule based and language specific segmentation, and espeak based alignment. pretty basic, but it worked wonders in terms of reliability and accuracy.

if you are looking to generate audio from ebooks this is probably not too helpful. it is something I tried to avoid. something about learning a language from generated audio didn't sit right with me haha.
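
for anyone curious what rule based, language specific segmentation can look like, here is a minimal python sketch. it is an illustration only, not the actual pipeline code: the abbreviation lists, the boundary regex, and the `split_sentences` name are all assumptions made up for the example, and a real setup would need far more rules per language.

```python
import re

# Hypothetical per-language abbreviation lists (assumption for this sketch).
# Tokens listed here end with a period but should not end a sentence.
ABBREVIATIONS = {
    "en": {"mr.", "mrs.", "dr.", "st.", "etc."},
    "es": {"sr.", "sra.", "dr.", "ud.", "uds."},
}

# Candidate boundary: sentence-ending punctuation, optional closing
# quotes or brackets, then at least one whitespace character.
_BOUNDARY = re.compile(r'[.!?…]+["”»\')\]]*\s+')


def split_sentences(text, lang="en"):
    """Split text into sentences using simple, language-specific rules."""
    abbreviations = ABBREVIATIONS.get(lang, set())
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        chunk = text[start:match.end()].strip()
        last_word = chunk.split()[-1].lower()
        if last_word in abbreviations:
            # Period belongs to an abbreviation, so keep scanning.
            continue
        sentences.append(chunk)
        start = match.end()
    # Whatever remains after the last boundary is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("El Sr. García llegó. ¿Dónde está?", "es"):
        print(sentence)
```

the point of keeping it rule based is that every split is deterministic and easy to debug per language, which fits the reliability goal mentioned above. each resulting sentence would then be handed to the alignment step.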

_popeye•2h ago

This is great! More beginner level stories would be much appreciated.

floo•1h ago

thanks, that's a really good point. having some beginner friendly books for each language is definitely a goal.

are you looking for stories in a specific language?

nbcesar•1h ago

Looks great - Exactly what I’m looking for. Could we get different dialects? For Spanish, I would love to be able to select a country for the audio. At least a Latin American version to start. Thanks for sharing.

floo•1h ago

cool idea. haven't really explored dialects yet. gonna see if I can find any latin american recordings. thanks for the suggestion!
