I initially tried running STT locally with Whisper on my MacBook Pro M3, but the latency and accuracy, especially for Korean, weren't good enough for live use.
A few days ago, I tried ElevenLabs Scribe v2 for real-time STT and combined it with DeepL for translation. The performance was impressive enough that I decided to build this web tool (elstt.co).
Key Features:
BYOK: Use your own ElevenLabs/DeepL API keys.
Sync: Keys are stored encrypted, so the same setup works across projector, PC, and mobile.
Status: It's being live-tested at my church. Not perfect yet, but I'm refining it every week.
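For anyone curious what the translation step looks like with a user-supplied key, here is a minimal sketch. It is not the code behind elstt.co. It assumes the STT side has already produced a finished transcript segment as a plain string, and that the direction is Korean to English (the `source_lang` and `target_lang` defaults are my assumption). The function names are hypothetical. The endpoint choice relies on DeepL's convention that free-plan keys end in `:fx`.

```python
import json
import urllib.request

DEEPL_FREE_URL = "https://api-free.deepl.com/v2/translate"
DEEPL_PRO_URL = "https://api.deepl.com/v2/translate"


def deepl_endpoint(api_key: str) -> str:
    """Pick the DeepL host for a BYOK key: free-plan keys end in ':fx'."""
    return DEEPL_FREE_URL if api_key.endswith(":fx") else DEEPL_PRO_URL


def build_translate_request(
    segment: str,
    api_key: str,
    target_lang: str = "EN-US",  # assumed target language
    source_lang: str = "KO",     # assumed source language
) -> urllib.request.Request:
    """Build (but do not send) a DeepL /v2/translate request for one segment."""
    payload = {
        "text": [segment],
        "target_lang": target_lang,
        "source_lang": source_lang,
    }
    return urllib.request.Request(
        deepl_endpoint(api_key),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate(segment: str, api_key: str, **kwargs: str) -> str:
    """Send the request and return the first translated text."""
    req = build_translate_request(segment, api_key, **kwargs)
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = json.load(resp)
    return body["translations"][0]["text"]
```

Calling `translate("안녕하세요", my_key)` would send one segment and return the translated string. Keeping the request builder separate from the network call makes it easy to check the headers and payload without spending API quota.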
I'd love to hear your thoughts on the architecture or any feedback!