I built this after paying monthly for a transcription app and wanting Cursor's voice-to-prompt feature everywhere. Press a shortcut, speak, and the text lands in your clipboard. Works anywhere you type: email, Slack, Teams, code editors.
Three modes:
Transcription: Shortcut → speak → text in clipboard. Uses Gemini Flash for its low latency. Whisper is available for offline use.
Voice-to-Prompt: Select text, press shortcut, speak instruction. Gemini processes selected text + voice instruction. Result goes to clipboard. Like Cursor's prompt-on-selection, but system-wide.
Read Aloud: Select text, press shortcut, speak command (or stay silent). If command: applies prompt first, then reads result. If no command: reads selected text directly.
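To make the Read Aloud branching concrete, here is a minimal sketch of the decision as I'd model it in Swift. This is illustrative only: `ReadAloudPlan` and `planReadAloud` are hypothetical names, not taken from the repo, and treating a whitespace-only transcript as "stayed silent" is an assumption on my part.

```swift
import Foundation

// Hypothetical model of the Read Aloud decision; names are illustrative
// and not taken from the whisper-shortcut source.
enum ReadAloudPlan: Equatable {
    // No spoken command: read the selection as-is.
    case readDirectly(String)
    // Spoken command: apply it to the selection first, then read the result.
    case promptThenRead(selection: String, instruction: String)
}

func planReadAloud(selection: String, spokenCommand: String?) -> ReadAloudPlan {
    // Assumption: nil, empty, or whitespace-only transcripts count as silence.
    let command = (spokenCommand ?? "")
        .trimmingCharacters(in: .whitespacesAndNewlines)
    if command.isEmpty {
        return .readDirectly(selection)
    }
    return .promptThenRead(selection: selection, instruction: command)
}
```

Calling `planReadAloud(selection: text, spokenCommand: nil)` yields `.readDirectly(text)`, while passing a transcript like "summarize this" yields `.promptThenRead`. Keeping the decision as a pure function separates it from the audio and model calls, which would sit behind whatever handles each case.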
Use cases: dictating prompts in Cursor, formatting messages, summarizing articles.
Native Swift/Cocoa app. Open source. Daily driver for me.
https://github.com/mgsgde/whisper-shortcut
Demo: https://youtu.be/yz8cbaI6NYQ