Show HN: Open-source transcription that costs $0.02/hour instead of $30/month

https://github.com/braden-w/whispering
3•braden-w•6h ago
I built Whispering because I believe transcription is too fundamental a tool to be locked behind paywalls. It's a cross-platform desktop and web transcription app that turns speech into text with a keyboard shortcut, among other things.

The app lets you bring your own API key (OpenAI, Groq, etc.) and make direct calls. If you want complete privacy, it also supports local transcription. Either way, your audio never goes through any middleman servers. It's super lightweight (~22MB), built with Svelte 5 and Tauri, and works on Mac, Windows, and Linux.

I've been using it daily for the past few months and just released v7 last night. I spent a considerable amount of time developing a clean architecture that's hopefully educational to read. This is one of the most complex Svelte 5 apps in production, with extensive use of runes and TanStack Query.

I'm happy to answer questions about implementation or how to build desktop apps with this stack!

Comments

braden-w•6h ago
For those interested in the architecture: I use dependency injection at build time to share ~95% of code between desktop and web versions. Instead of maintaining separate codebases, I detect the platform and inject the appropriate service implementations.
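A rough sketch of that idea, assuming a hypothetical `ClipboardService` interface and a hypothetical `createServices` factory (none of these names are taken from the Whispering repo). The platform is passed in as a plain parameter here; in a real build it would presumably come from a compile-time flag, and the two implementations below are in-memory stand-ins rather than real Tauri or browser calls.

```typescript
// Illustrative only: every name here is hypothetical, not from the Whispering codebase.
type Platform = "desktop" | "web";

// The shared code depends on this interface, never on a concrete platform.
interface ClipboardService {
  readonly platform: Platform;
  copy(text: string): void;
  read(): string;
}

// Stand-in for an implementation that would call native desktop APIs.
function createDesktopClipboard(): ClipboardService {
  let contents = "";
  return {
    platform: "desktop",
    copy: (text) => {
      contents = text;
    },
    read: () => contents,
  };
}

// Stand-in for an implementation that would call browser APIs.
function createWebClipboard(): ClipboardService {
  let contents = "";
  return {
    platform: "web",
    copy: (text) => {
      contents = text;
    },
    read: () => contents,
  };
}

// The only place that inspects the platform; everything else receives `services`.
function createServices(platform: Platform): { clipboard: ClipboardService } {
  return {
    clipboard:
      platform === "desktop" ? createDesktopClipboard() : createWebClipboard(),
  };
}
```

With this shape, shared code would call `createServices(...)` once at startup and pass the result around, so only the factory needs to know which platform it is running on.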

The three-layer architecture has been particularly helpful:

- Services (https://github.com/braden-w/whispering/tree/main/apps/app/sr...): pure functions with platform abstraction, no UI dependencies

- Query layer (https://github.com/braden-w/whispering/tree/main/apps/app/sr...): adds reactivity, caching, and runtime dependency injection
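
To make the split between the two layers above concrete, here is a minimal sketch. It assumes a hypothetical `TranscriptionService` interface and a hypothetical `createTranscriptionQueries` wrapper, and it uses a hand-rolled `Map` cache as a stand-in for TanStack Query, so it shows only the layering and not the actual library API or the app's real code.

```typescript
// Illustrative only: hypothetical names; a plain Map stands in for TanStack Query.

// Services layer: no UI imports, just a function from input to output.
interface TranscriptionService {
  transcribe(recordingId: string): Promise<string>;
}

// Query layer: receives the service at runtime and adds caching on top of it.
function createTranscriptionQueries(service: TranscriptionService) {
  const cache = new Map<string, Promise<string>>();
  return {
    // Returns the cached promise if this recording was already requested.
    transcribe(recordingId: string): Promise<string> {
      let pending = cache.get(recordingId);
      if (pending === undefined) {
        pending = service.transcribe(recordingId);
        cache.set(recordingId, pending);
      }
      return pending;
    },
    // Drops the cached entry so the next call reaches the service again.
    invalidate(recordingId: string): void {
      cache.delete(recordingId);
    },
  };
}
```

Because the service is passed in as an argument, a test or a different platform can hand the query layer a fake or alternative implementation without changing the layer itself.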

Voice-activated mode is particularly nice when coding—you can keep your hands on the keyboard while dictating. The Svelte 5 runes + TanStack Query combination has been fantastic for managing real-time audio state.