Built a tool combining MLX Whisper + pyannote for fast local audio transcription with speaker diarization on Apple Silicon.
Key benefits: privacy-first (fully local), hardware-accelerated, automatic speaker identification, multiple output formats (TXT/SRT/JSON).
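To give a feel for the SRT output, here is a minimal sketch of how speaker-labelled segments could be rendered. The segment dict keys (`start`, `end`, `speaker`, `text`) and the `[SPEAKER]` prefix are assumptions for illustration, not necessarily the tool's actual schema.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render segments as SRT cues, prefixing each with its speaker label.

    Each segment is a dict with hypothetical keys:
    start (s), end (s), speaker, text.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"[{seg['speaker']}] {seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```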
The main technical challenge was getting MLX Whisper and pyannote to work together, since they process audio differently. A shared preprocessing pipeline solved it.
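As a rough idea of what such a preprocessing step can look like, the sketch below normalizes any input to 16 kHz mono 16-bit WAV with ffmpeg, so both models read the same file. This is an assumption about the approach, not a description of the tool's actual pipeline. The function names are hypothetical and `ffmpeg` is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, sample_rate=16_000):
    """Build an ffmpeg command that converts src to mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                    # overwrite dst if it already exists
        "-i", str(src),          # input file, any format ffmpeg can decode
        "-ac", "1",              # downmix to a single channel
        "-ar", str(sample_rate), # resample (16 kHz assumed here)
        "-c:a", "pcm_s16le",     # 16-bit little-endian PCM
        str(dst),
    ]


def normalize_audio(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
    return Path(dst)
```

The normalized file could then be handed to both the transcription and the diarization step, so their timestamps refer to the same audio.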
Well suited to interviews, meetings, and podcasts. Handles gated Hugging Face models, with proper error handling.
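For the gated-model case, a minimal sketch of token handling might look like the following: take the token from a command-line value if one was given, otherwise fall back to an environment variable, and fail with a readable message if neither is set. The function name, the error class, and the choice of `HF_TOKEN` as the variable are assumptions for illustration; the tool's real option names may differ.

```python
import os


class MissingTokenError(RuntimeError):
    """Raised when no Hugging Face token can be found."""


def resolve_hf_token(cli_token=None, env=None):
    """Return the token from the CLI value, else from the environment."""
    env = os.environ if env is None else env
    # An explicit command-line value wins over the environment variable.
    token = cli_token or env.get("HF_TOKEN")
    if not token:
        raise MissingTokenError(
            "No Hugging Face token found. Accept the model's terms on its "
            "Hugging Face page, then set HF_TOKEN or pass a token explicitly."
        )
    return token.strip()
```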
torstenvl•5mo ago
Is there a reason it's ASi-only? I don't know the technical details of MLX, whether it runs or can be run on other hardware, etc.
Also, why does the HF token need to be in an environment variable and passed on the command line?