We've been exploring what's possible for low-latency, privacy-first voice agents, and just released a demo: Agent Santa.
The entire voice-to-text-to-speech loop runs locally on a sub-$250 Nvidia Jetson Orin Nano.
The ML Stack:
- STT: OpenAI Whisper EN tiny
- LLM: LiquidAI’s 700M-parameter LFM2
- TTS: Our NeuTTS (zero-cost cloning, high quality)
The whole pipeline consumes under 4GB RAM and 2GB VRAM, showing that complex, multi-model AI can be deployed fully on edge devices today.
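The turn structure of the stack above can be sketched as a simple three-stage loop. This is a minimal illustration, not the actual demo code: the function names (`transcribe`, `generate_reply`, `synthesize`) are hypothetical placeholders for the Whisper tiny.en, LFM2, and NeuTTS calls, and each stage is stubbed with a canned value so only the control flow is shown.

```python
def transcribe(audio_chunk: bytes) -> str:
    # STT stage: on-device this would run Whisper EN tiny on the audio.
    return "hello santa"  # stubbed transcript

def generate_reply(transcript: str) -> str:
    # LLM stage: on-device this would prompt the 700M-parameter LFM2.
    return f"Ho ho ho! You said: {transcript}"  # stubbed reply

def synthesize(text: str) -> bytes:
    # TTS stage: on-device this would call NeuTTS to produce audio.
    return text.encode("utf-8")  # stubbed waveform bytes

def voice_loop(audio_chunk: bytes) -> bytes:
    """One turn of the voice-to-text-to-speech loop: STT -> LLM -> TTS."""
    transcript = transcribe(audio_chunk)
    reply = generate_reply(transcript)
    return synthesize(reply)

# One simulated turn on a dummy audio buffer.
audio_out = voice_loop(b"\x00" * 320)
print(audio_out.decode("utf-8"))
```

The latency of a real turn is dominated by the three model inferences in sequence, which is why keeping all three models small enough to stay resident in memory matters on a device like the Orin Nano.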
We'd love to hear your feedback on the latency and potential applications for this level of extreme on-device efficiency.
digdugdirk•1h ago
It looks cool, and I'm 100% behind the idea, but I'm more curious about what could be done on hardware we all already have broader access to, without requiring a standalone purpose-built device.
What are the options for midlevel (or older flagship) smartphones? Used PCs? Macbooks with broken screens?