Show HN: AVA – AI Voice Agent for Traditional Phone Systems (Python + Asterisk/ARI)

https://github.com/hkjarral/AVA-AI-Voice-Agent-for-Asterisk

4•hkjarral•1h ago

Hi HN, I'm the creator of AVA, an AI Voice Agent for Asterisk.

My repo was shared here once before by someone else, so I wanted to follow up on the progress since then:

https://news.ycombinator.com/item?id=46380399

I've been working with Asterisk/FreePBX systems for years. I wanted to add AI voice capabilities to legacy phone systems without paying per-minute SaaS fees or ripping out the entire telephony stack.

So I built AVA, a self-hosted AI voice agent that can integrate into any traditional phone system. While most solutions demand expensive migrations to cloud-only providers, AVA gives you a self-hosted path to connect AI agents to the phone system you already run, keeping call data on your own infrastructure and lowering operational costs.

AVA is a Dockerized Python app that sits alongside your Asterisk server. It connects via ARI (Asterisk REST Interface) and routes call audio to AI providers — OpenAI Realtime, Deepgram, Google Live API, ElevenLabs, Telnyx, or fully local models (Vosk + llama.cpp + Piper). You can mix and match STT/LLM/TTS in a modular pipeline, or use a single provider end-to-end.
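
To illustrate the mix-and-match idea, here is a minimal sketch of a modular STT/LLM/TTS pipeline. All class and method names (`STT`, `LLM`, `TTS`, `Pipeline`, `handle_utterance`, and the stub components) are hypothetical and chosen for illustration; they are not taken from the AVA codebase, and a real pipeline would stream audio rather than process whole utterances.

```python
from typing import Protocol


class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


class Pipeline:
    """Chains any STT, LLM, and TTS component that satisfies the protocols."""

    def __init__(self, stt: STT, llm: LLM, tts: TTS) -> None:
        self.stt = stt
        self.llm = llm
        self.tts = tts

    def handle_utterance(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)
        answer = self.llm.reply(text)
        return self.tts.synthesize(answer)


# Stub components standing in for real providers (hypothetical).
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = Pipeline(EchoSTT(), UpperLLM(), BytesTTS())
```

Swapping a provider then means passing a different object to `Pipeline`; nothing else in the call path has to change.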

Two audio transport paths: We support both AudioSocket (low-latency TCP with TLV framing) and ExternalMedia RTP (UDP, better for NAT). A transport orchestrator auto-negotiates sample rates and codecs between what Asterisk sends on the wire and what each AI provider expects — so you can run 8kHz ulaw from Asterisk into a provider that wants 24kHz linear16 without manual config.
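
As a rough sketch of the kind of conversion involved, the snippet below decodes an 8kHz G.711 ulaw frame and upsamples it to 24kHz 16-bit little-endian PCM. The function names are hypothetical, and the 3x linear interpolation is a simplification I picked for readability; it is not necessarily how the transport orchestrator resamples, and a production resampler would normally apply a proper low-pass filter.

```python
import struct

ULAW_BIAS = 0x84  # 132, the standard G.711 mu-law bias


def ulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF               # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def upsample_3x(samples):
    """Upsample by 3 with linear interpolation (8 kHz -> 24 kHz)."""
    out = []
    for i, a in enumerate(samples):
        # Hold the last sample when there is no next sample to interpolate to.
        b = samples[i + 1] if i + 1 < len(samples) else a
        out.append(a)
        out.append(a + (b - a) // 3)
        out.append(a + 2 * (b - a) // 3)
    return out


def ulaw8k_to_pcm24k(frame: bytes) -> bytes:
    """Convert an 8 kHz mu-law frame to 24 kHz little-endian linear16."""
    decoded = [ulaw_to_linear(b) for b in frame]
    upsampled = upsample_3x(decoded)
    return struct.pack("<%dh" % len(upsampled), *upsampled)
```

A 20 ms ulaw frame at 8kHz is 160 bytes; after conversion it becomes 480 samples, or 960 bytes of linear16.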

Session lifecycle: A typed session store tracks every call from StasisStart through hangup — audio diagnostics, barge-in counts, provider state, conversation turns. Every call is fully observable and debuggable after the fact.
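
A minimal sketch of what a typed session store could look like, assuming one record per channel that is kept after hangup so it can be inspected later. The names `CallSession`, `SessionStore`, and every field and method are hypothetical; the real store tracks more (audio diagnostics, provider state) than is shown here.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CallSession:
    channel_id: str
    provider: str
    started_at: float = field(default_factory=time.time)
    ended_at: Optional[float] = None
    barge_in_count: int = 0
    turns: List[str] = field(default_factory=list)


class SessionStore:
    """Keeps one CallSession per channel, including finished calls."""

    def __init__(self) -> None:
        self._sessions: Dict[str, CallSession] = {}

    def on_stasis_start(self, channel_id: str, provider: str) -> CallSession:
        session = CallSession(channel_id=channel_id, provider=provider)
        self._sessions[channel_id] = session
        return session

    def record_barge_in(self, channel_id: str) -> None:
        self._sessions[channel_id].barge_in_count += 1

    def add_turn(self, channel_id: str, text: str) -> None:
        self._sessions[channel_id].turns.append(text)

    def on_hangup(self, channel_id: str) -> CallSession:
        # Mark the call finished but keep the record for later debugging.
        session = self._sessions[channel_id]
        session.ended_at = time.time()
        return session

    def get(self, channel_id: str) -> Optional[CallSession]:
        return self._sessions.get(channel_id)
```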

Barge-in and VAD were the hardest problems. We use a dual-mode VAD — WebRTC VAD combined with energy-based RMS detection, scored into a single confidence value (40% WebRTC weight, 40% energy ratio, 20% agreement bonus). Frame smoothing prevents single-frame glitches from triggering false interrupts. When barge-in fires, we kill active playback (both streaming and file-based) via ARI, flush provider audio buffers, release conversation gating tokens, and optionally suppress provider output for a configurable window to prevent pre-barge audio from re-queuing. The system supports three interrupt sources: local VAD, Asterisk's native talk detection events, and provider-side interruption signals.
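
To make the scoring concrete, here is a minimal sketch of the 40/40/20 weighting plus frame smoothing. Several details are assumptions on my part for illustration: that the energy ratio is frame RMS divided by a fixed threshold and capped at 1, that the agreement bonus applies only when both detectors report speech, and the specific `energy_threshold`, `threshold`, and `min_frames` values. The WebRTC VAD decision is passed in as a boolean so the sketch has no external dependencies.

```python
import math
import struct
from collections import deque


def rms(pcm16: bytes) -> float:
    """Root-mean-square energy of a little-endian 16-bit PCM frame."""
    n = len(pcm16) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack("<%dh" % n, pcm16[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def vad_confidence(webrtc_speech: bool, frame_rms: float,
                   energy_threshold: float = 1000.0) -> float:
    """Blend both detectors: 40% WebRTC, 40% energy ratio, 20% agreement."""
    energy_ratio = min(frame_rms / energy_threshold, 1.0)
    energy_speech = frame_rms >= energy_threshold
    agreement = 1.0 if (webrtc_speech and energy_speech) else 0.0
    webrtc_score = 1.0 if webrtc_speech else 0.0
    return 0.4 * webrtc_score + 0.4 * energy_ratio + 0.2 * agreement


class SmoothedBargeIn:
    """Fires only after min_frames consecutive frames exceed the threshold."""

    def __init__(self, threshold: float = 0.6, min_frames: int = 3) -> None:
        self.threshold = threshold
        self.recent = deque(maxlen=min_frames)

    def update(self, confidence: float) -> bool:
        self.recent.append(confidence >= self.threshold)
        return len(self.recent) == self.recent.maxlen and all(self.recent)
```

With these settings a single loud frame surrounded by silence never triggers an interrupt, because the detector needs three consecutive high-confidence frames before it fires.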

The hardest latency challenge was bridging legacy SIP/RTP with modern WebSocket streams. We use a two-container architecture: a lightweight orchestrator for ARI state management and an optional heavier container for local model inference. There are 6 pre-validated golden baseline configs if you just want something working out of the box, plus an Admin UI for visual setup.

Try the live demo: (925)-736-6718. Option 5 for Google, 6 for Deepgram, 7 for OpenAI Realtime, 8 for local hybrid, and 9 for ElevenLabs.

Code is MIT. I'd love feedback on the transport layer (src/core/transport_orchestrator.py) and the VAD tuning (src/core/vad_manager.py).
