IceVox is a desktop voice chat app where audio effects (pitch shift, echo, tremolo, vibrato, distortion, chorus, reverb) are applied in real time before transmission. Peer-to-peer, no server to run, no account. Free and open source (MIT).
The interesting technical bits:
Audio processing runs in an AudioWorklet at the fixed 128-sample render quantum for sub-20ms latency. Effects are applied sender-side, so the receiver gets the processed audio without doing any work. The processed stream is fed to WebRTC via replaceTrack() on the PeerJS media sender, since PeerJS ignores custom streams passed to call().
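To make the worklet idea concrete, here is a minimal sketch of one of the effects (tremolo) processed at the 128-sample block size. This is not IceVox's actual code: the class name and parameters are illustrative, and the DSP is kept in a plain class so it can run anywhere; inside the app this loop would live in an AudioWorkletProcessor's process() callback.

```javascript
// Illustrative tremolo kernel: an LFO modulates per-sample gain.
// In a real AudioWorkletProcessor, process(inputs, outputs) hands you
// 128-sample Float32Array frames; this class mirrors that contract.
class TremoloKernel {
  constructor(sampleRate = 48000, rateHz = 5, depth = 0.5) {
    this.sampleRate = sampleRate; // audio sample rate in Hz
    this.rateHz = rateHz;         // LFO speed
    this.depth = depth;           // 0 = no effect, 1 = full amplitude swing
    this.phase = 0;               // LFO phase carried across frames
  }

  // Processes one frame (typically 128 samples) in place and returns it.
  process(frame) {
    const inc = (2 * Math.PI * this.rateHz) / this.sampleRate;
    for (let i = 0; i < frame.length; i++) {
      // Gain swings between 1 and (1 - depth) as the LFO oscillates.
      const gain = 1 - this.depth * (0.5 + 0.5 * Math.sin(this.phase));
      frame[i] *= gain;
      this.phase += inc;
    }
    return frame;
  }
}
```

Keeping state like `this.phase` on the processor is what makes per-frame processing glitch-free: each 128-sample frame picks up exactly where the last one left off.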
Remote audio playback uses <audio> elements via createMediaElementSource(), not createMediaStreamSource(). The latter produces silent output for WebRTC remote streams in Chromium/Electron — a known issue that cost me a few evenings.
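A sketch of that playback workaround, with illustrative names (this is the shape of the fix, not the project's actual code): route the remote MediaStream into an &lt;audio&gt; element, then tap the element with createMediaElementSource() so the stream flows through the audio graph instead of staying silent.

```javascript
// Hypothetical helper showing the <audio>-element workaround for silent
// createMediaStreamSource() output on WebRTC remote streams in Chromium.
// audioCtx is an AudioContext; remoteStream comes from the peer connection.
function attachRemoteAudio(audioCtx, remoteStream) {
  const el = new Audio();
  el.srcObject = remoteStream; // feed the WebRTC stream into the element
  el.play();                   // element must be playing to produce audio

  // From this call on, the element's output is rerouted into the graph
  // (it no longer plays directly), so per-peer gain/analysis nodes can
  // sit between the source and the destination.
  const source = audioCtx.createMediaElementSource(el);
  source.connect(audioCtx.destination);

  return { element: el, source };
}
```

One caveat worth knowing: once createMediaElementSource() is called, muting or adjusting volume has to happen on graph nodes, since the element itself no longer outputs directly.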
The network is a WebRTC mesh (up to 6 peers), with PeerJS handling signaling through their free server. Video runs over separate RTCPeerConnections with SDP/ICE signaled through the existing PeerJS data channel. ICE config includes STUN + public TURN relays for users behind strict NAT.
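The ICE setup described above looks roughly like the config below. The Google STUN URL is a well-known public server; the TURN entry and its credentials are placeholders, not the relays IceVox actually uses. The helper also shows why a full mesh caps out quickly: link count grows quadratically with peers.

```javascript
// Illustrative RTCPeerConnection config: STUN for NAT discovery, plus a
// TURN relay (placeholder credentials) as fallback behind strict NAT.
const rtcConfig = {
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    {
      urls: 'turn:turn.example.com:3478', // placeholder relay, not IceVox's
      username: 'demo-user',
      credential: 'demo-pass',
    },
  ],
};

// A full mesh of n peers needs n*(n-1)/2 pairwise connections, and each
// peer uplinks its stream n-1 times, which is why small room caps make sense.
const meshLinks = (n) => (n * (n - 1)) / 2;
```

At the 6-peer cap that is 15 pairwise links and 5 outgoing copies of each stream per peer, which is about the practical ceiling for a mesh on a home connection.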
Backstory: I originally built this in C++ with JUCE and SoundTouch on Linux. Pitch shifting and presets worked, but the project died trying to port audio drivers to Windows (ASIO/DirectSound). Rebuilt from scratch with Electron and Web Audio API. The web platform turned out to be a much better fit — AudioWorklet gives you low-level sample access without the driver headaches.
Stack: Electron, Web Audio API (AudioWorklet), WebRTC, PeerJS. Windows only for now.
I know Electron isn't everyone's favorite given its memory footprint, but the Chromium WebRTC and Web Audio implementations made it the most pragmatic choice for getting this off the ground as a solo dev.
The source is MIT licensed [1]. Would love to hear your thoughts on the architecture, or if anyone has experience optimizing Web Audio graphs further!