- Visualize who speaks when & for how long
- Jump/skip speaker segments
- Remove/disable speakers (auto-skip)
- Set different playback speeds for each speaker
It's a better, more efficient way to listen to podcasts, interviews, press conferences, etc. It has first-class support for YouTube videos; just drop in a URL. It also supports your local media files. All processing runs on-device.
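To give a rough sense of why per-speaker skipping and speeds save time, here is a minimal sketch. It assumes diarization output is a list of (speaker, start, end) segments; the `Segment` class, the `listening_time` function, and the sample data are all hypothetical and are not Zanshin's actual code or Senko's output format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # speaker label (hypothetical naming)
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds


def listening_time(segments, speeds=None, disabled=None):
    """Seconds needed to play the segments, given per-speaker
    playback speeds and a set of speakers to skip entirely."""
    speeds = speeds or {}
    disabled = disabled or set()
    total = 0.0
    for seg in segments:
        if seg.speaker in disabled:
            continue  # auto-skip a disabled speaker
        # Speakers without an explicit speed play at 1x.
        total += (seg.end - seg.start) / speeds.get(seg.speaker, 1.0)
    return total


# Made-up 7-minute recording, for illustration only.
segments = [
    Segment("host", 0.0, 60.0),
    Segment("guest", 60.0, 240.0),
    Segment("ad_read", 240.0, 300.0),
    Segment("guest", 300.0, 420.0),
]

print(listening_time(segments))  # 420.0 (everything at 1x)
print(listening_time(segments,
                     speeds={"host": 2.0, "guest": 1.5},
                     disabled={"ad_read"}))  # 230.0
```

In this made-up example, skipping one speaker and speeding up the other two cuts 7 minutes of audio down to under 4 minutes of listening.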
Download it today for macOS. It also works on Linux and WSL, though it isn't packaged for them yet; you can still get it running with just a few terminal commands. Check out the repo for instructions: https://github.com/narcotic-sh/zanshin
Zanshin is powered by Senko, a new, very fast speaker diarization pipeline I've developed.
On an M3 MacBook Air, it takes over 5 minutes to process 1 hour of audio using Pyannote 3.1, the leading open-source diarization pipeline. With Senko, it only takes ~24 seconds, a ~14x speed improvement. And on an RTX 4090 + Ryzen 9 7950X machine, processing 1 hour of audio takes just 5 seconds with Senko, a ~17x speed improvement.
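For anyone who wants to sanity-check those numbers, here is the arithmetic as a small sketch. The function name is my own, and the implied Pyannote time is just back-calculated from the rounded ~24 s and ~14x figures, so treat it as approximate.

```python
AUDIO_SECONDS = 3600  # 1 hour of audio


def realtime_factor(processing_seconds, audio_seconds=AUDIO_SECONDS):
    """How many times faster than real time the audio is processed."""
    return audio_seconds / processing_seconds


print(realtime_factor(24))  # 150.0 -> M3 MacBook Air, Senko
print(realtime_factor(5))   # 720.0 -> RTX 4090 + Ryzen 9 7950X, Senko

# Back-calculated Pyannote 3.1 time on the M3 from the rounded figures:
implied_baseline_seconds = 24 * 14
print(implied_baseline_seconds / 60)  # 5.6 minutes, i.e. "over 5 minutes"
```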
Senko's speed is what makes Zanshin possible. Senko is a modified version of the speaker diarization pipeline found in the excellent 3D-Speaker project. Check out Senko here: https://github.com/narcotic-sh/senko
Cheers, everyone; enjoy 残心/Zanshin and Senko. I hope you find them useful. Let me know what you think!
~
Side note: I am looking for a job. If you like my work and have an opportunity for me, I'm all ears :) You can contact me at mhamzaqayyum [at] icloud.com