I built an offline-first voice AI that runs entirely on Apple Silicon using MLX + FastAPI. It achieves under 1 s end-to-end latency for speech-to-speech conversations and comes with a minimal UI.
Repo: https://github.com/shubhdotai/offline-voice-ai
Would love feedback on performance, model choices, and any other ideas.
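
If you want to compare latency numbers, here is a minimal sketch of how per-stage timing for a speech-to-speech pipeline could be measured. It is not the repo's actual code: `time_pipeline` and the `fake_stt`, `fake_llm`, and `fake_tts` stages are hypothetical placeholders that you would swap for real speech-to-text, language model, and text-to-speech calls.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def time_pipeline(stages: List[Stage], payload: Any) -> Tuple[Any, Dict[str, float]]:
    """Run each stage in order, feeding its output to the next one.

    Returns the final output and a dict of wall-clock seconds per stage,
    plus a "total" entry that sums the stage timings.
    """
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        payload = fn(payload)
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return payload, timings


# Placeholder stages (assumptions for illustration only).
def fake_stt(audio: bytes) -> str:
    return "hello"  # pretend transcription of the audio


def fake_llm(text: str) -> str:
    return text.upper()  # pretend model reply


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend synthesized audio


stages: List[Stage] = [("stt", fake_stt), ("llm", fake_llm), ("tts", fake_tts)]
result, timings = time_pipeline(stages, b"\x00\x01")

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.2f} ms")
```

Timing each stage separately makes it easier to see which one dominates the end-to-end number when trying different models.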