The goal: a real-time, phone-based voice bot that understands and speaks Tamil (and other Indian languages), with latency low enough for a natural back-and-forth conversation.
Stack used:
LiveKit for SIP/WebRTC and real-time audio
Sarvam (Indian STT/TTS models) for multilingual voice
OpenAI GPT-4.1 for conversation logic
Silero VAD for barge-in and turn-taking
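Roughly how those pieces fit together per caller turn, as a minimal sketch rather than the actual implementation: speech goes through STT, the text goes to the LLM, and the reply is played back through TTS, with playback cancelled if the VAD hears the caller start talking (barge-in). Every name here (`VoicePipeline`, `on_user_turn`, `on_speech_started`, the `fake_*` stages) is a hypothetical stand-in, not the real LiveKit, Sarvam, OpenAI, or Silero API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Optional, Tuple


@dataclass
class VoicePipeline:
    """One-call pipeline. Each stage is a hypothetical stand-in callable."""
    transcribe: Callable[[bytes], Awaitable[str]]  # STT stage (Sarvam's role)
    reply: Callable[[str], Awaitable[str]]         # LLM stage (GPT-4.1's role)
    speak: Callable[[str], Awaitable[None]]        # TTS + playback (Sarvam's role)
    _playback: Optional[asyncio.Future] = field(default=None, init=False, repr=False)

    async def on_user_turn(self, audio: bytes) -> str:
        """Run one caller turn: audio -> text -> reply -> speech."""
        text = await self.transcribe(audio)
        answer = await self.reply(text)
        # Play in the background so a barge-in can interrupt it.
        self._playback = asyncio.ensure_future(self.speak(answer))
        return answer

    async def on_speech_started(self) -> bool:
        """Barge-in hook (the VAD's role): stop bot playback if it is running.

        Returns True if playback was interrupted, False if nothing was playing.
        """
        task = self._playback
        if task is None or task.done():
            return False
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass  # expected: playback was cut off on purpose
        return True


async def demo() -> Tuple[str, bool, bool, List[str]]:
    played: List[str] = []  # replies whose playback ran to completion

    async def fake_stt(audio: bytes) -> str:
        return audio.decode("utf-8")  # pretend the audio is already text

    async def fake_llm(text: str) -> str:
        return f"echo: {text}"

    async def fake_tts(text: str) -> None:
        await asyncio.sleep(1.0)  # pretend playback takes a second
        played.append(text)

    bot = VoicePipeline(fake_stt, fake_llm, fake_tts)
    answer = await bot.on_user_turn("வணக்கம்".encode("utf-8"))
    await asyncio.sleep(0)  # give playback a chance to start
    first = await bot.on_speech_started()   # caller interrupts the bot
    second = await bot.on_speech_started()  # nothing left to interrupt
    return answer, first, second, played


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Running playback as a separate task is the design choice that matters here: the bot stays able to react to the caller while it is still speaking, which is what makes barge-in possible at all.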
Round-trip latency is low. It’s still early, but it has worked surprisingly well so far, even on flaky connections.
Full write-up with stack, lessons, and demo: https://avaneeshrajkumar.substack.com/p/from-hello-to-namast...
Curious if others are experimenting with voice AI. Would love to hear use-case ideas.