The transcription itself (Whisper via MLX) isn't new; plenty of people have done that. What I find useful is the combination: transcription, LLM cleanup (Qwen 2.5-3B via Ollama), and pasting the result wherever your cursor is. That last part is (to me) the magic of tools like WisprFlow, and it turns out you can replicate it in a single Python script that runs fully locally, with performance comparable to paid subscriptions.
It's a proof of concept more than a polished product, but I've been using it daily. It's a testament to our times that all of this fits in a single script.
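
To give a feel for how the three stages chain together, here is a minimal sketch. It is not the actual script. It assumes the `mlx-whisper` package for transcription, Ollama listening on its default local port with a `qwen2.5:3b` model pulled, and macOS for the paste step (`pbcopy` plus a simulated Cmd+V). The function names, the cleanup prompt, and the model tag are illustrative assumptions, and audio recording is left out entirely.

```python
import json
import subprocess
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
CLEANUP_MODEL = "qwen2.5:3b"  # assumed Ollama tag for Qwen 2.5-3B


def transcribe(audio_path):
    """Speech to text with Whisper on MLX (assumes the mlx-whisper package)."""
    import mlx_whisper  # lazy import so the rest of the sketch loads without it

    # Uses the package's default model; pass path_or_hf_repo=... to pick another.
    return mlx_whisper.transcribe(audio_path)["text"].strip()


def build_cleanup_request(raw_text, model=CLEANUP_MODEL):
    """Build the JSON payload for Ollama's /api/generate (prompt is illustrative)."""
    prompt = (
        "Clean up this dictated text. Fix punctuation and remove filler words. "
        "Return only the cleaned text.\n\n" + raw_text
    )
    return {"model": model, "prompt": prompt, "stream": False}


def cleanup(raw_text):
    """Send the raw transcript to the local LLM and return its cleaned version."""
    body = json.dumps(build_cleanup_request(raw_text)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["response"].strip()


def paste_at_cursor(text):
    """macOS only: copy the text to the clipboard, then simulate Cmd+V."""
    subprocess.run(["pbcopy"], input=text.encode("utf-8"), check=True)
    subprocess.run(
        ["osascript", "-e",
         'tell application "System Events" to keystroke "v" using command down'],
        check=True,
    )


def dictate(audio_path, transcribe_fn=transcribe, cleanup_fn=cleanup,
            paste_fn=paste_at_cursor):
    """Run transcription, cleanup, and paste in order; return the pasted text."""
    raw = transcribe_fn(audio_path)
    cleaned = cleanup_fn(raw) if raw else ""  # skip the LLM call on silence
    if cleaned:
        paste_fn(cleaned)
    return cleaned
```

Calling `dictate("note.wav")` (a hypothetical file name) would run all three stages against a recording on disk. Each stage is passed in as a plain function, so any one of them can be swapped or stubbed out on its own. The simulated keystroke generally requires granting Accessibility permission to whatever app launches the script.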