I like my SwiftKey keyboard, though, and did not want to replace it, so the only option was a floating push-to-talk button that sits on top of any app.
You tap the overlay, speak, and tap again; the app then transcribes the audio and inserts the text into the currently focused field.
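The tap-to-start, tap-to-stop flow can be pictured as a two-state toggle. The following is only an illustrative Python sketch, not the app's code: the `PushToTalk` class and the `transcribe` and `insert_text` callables are hypothetical names standing in for the speech engine and for whatever mechanism writes into the focused field.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class PushToTalk:
    """Hypothetical model of the overlay button: first tap records, second tap sends."""

    def __init__(self, transcribe, insert_text):
        self.transcribe = transcribe    # assumed: bytes -> str (local or cloud engine)
        self.insert_text = insert_text  # assumed: writes text into the focused field
        self.state = State.IDLE
        self.chunks = []

    def on_audio(self, chunk):
        # Microphone data is only kept while a recording is in progress.
        if self.state is State.RECORDING:
            self.chunks.append(chunk)

    def on_tap(self):
        if self.state is State.IDLE:
            # First tap: start a fresh recording.
            self.chunks = []
            self.state = State.RECORDING
        else:
            # Second tap: stop, transcribe, and insert the result if there is one.
            audio = b"".join(self.chunks)
            self.state = State.IDLE
            text = self.transcribe(audio)
            if text:
                self.insert_text(text)
```

With a fake engine such as `lambda audio: audio.decode().upper()` and `insert_text=print`, two taps around a couple of `on_audio` calls print the "transcribed" text once.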
It supports local on-device transcription, cloud transcription with your own OpenAI key, and optional post-processing/cleanup for punctuation, formatting, prompts, commands, etc.
A nice use case for me has been Termux / terminal workflows on Android. There is a "dev mode" where you can just say "command mode", and anything after it is converted into a proper CLI command.
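To make the trigger-phrase idea concrete, here is a minimal Python sketch of splitting a transcript at "command mode". It is an assumption about how such a split could work, not how the app does it: `split_command_mode` is a hypothetical helper, and turning the spoken request into an actual CLI command (presumably a post-processing step) is left out.

```python
import re

# The spoken trigger, plus any punctuation the transcriber adds right after it.
TRIGGER = re.compile(r"\bcommand mode\b[\s,.:]*", re.IGNORECASE)


def split_command_mode(transcript):
    """Return (dictated_text, command_request); command_request is None without a trigger."""
    match = TRIGGER.search(transcript)
    if match is None:
        return transcript.strip(), None
    dictated = transcript[:match.start()].strip()
    request = transcript[match.end():].strip()
    # A trigger with nothing after it carries no command request.
    return dictated, request or None
```

For example, `split_command_mode("Command mode, list all files including hidden ones")` returns `("", "list all files including hidden ones")`; the second element is what a cleanup step would then rewrite into something like `ls -la`.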
The app is open source. There is no backend: in cloud mode, requests go directly from the phone to OpenAI using the user's own API key.
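For readers wondering what "directly to OpenAI" looks like, below is a hedged Python sketch of a request to OpenAI's public audio transcription endpoint, built with the standard library only. It is not taken from the app: the `build_transcription_request` helper, the `whisper-1` model choice, and the `clip.wav` filename are assumptions for illustration. The function only builds the request, so nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import urllib.request
import uuid

OPENAI_TRANSCRIBE_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, api_key,
                                filename="clip.wav", model="whisper-1"):
    """Build (but do not send) a multipart POST carrying the audio and model name."""
    boundary = uuid.uuid4().hex
    parts = [
        # Form field: which model should transcribe the audio (assumed value).
        (f"--{boundary}\r\n"
         'Content-Disposition: form-data; name="model"\r\n\r\n'
         f"{model}\r\n").encode(),
        # Form field: the audio file itself.
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode(),
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ]
    return urllib.request.Request(
        OPENAI_TRANSCRIBE_URL,
        data=b"".join(parts),
        headers={
            # The user's own key is the only credential involved.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` and parsing the JSON body should yield the transcript under a `text` field in the default response format. The key travels only in the `Authorization` header of that one request, which is the point of having no intermediate server.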
Repo: https://github.com/kafkasl/phone-whisper
APK: https://github.com/kafkasl/phone-whisper/releases