I must say, speaker diarization is surprisingly tricky to do. The most common approach seems to be to use pyannote, but the quality is not amazing...
yt-dlp --write-auto-subs --skip-download "https://www.youtube.com/watch?v=7xTGNNLPyMI"
(with that said, I do not want to diminish OP's work in any way; great job! "What I cannot build, I do not understand" - Feynman)
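For anyone going the `--write-auto-subs` route, here is a minimal sketch of turning the downloaded `.vtt` file into plain text. It assumes the auto-generated captions use inline timing tags and repeat each line across consecutive cues (the usual "rolling" layout); `vtt_to_text` is a hypothetical helper name, not part of yt-dlp.

```python
import re

# Inline WebVTT markup such as <00:00:00.500> or <c>...</c>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Collapse a WebVTT caption file into one plain-text string."""
    lines: list[str] = []
    for raw in vtt.splitlines():
        line = raw.strip()
        # Skip blanks, the file header, and cue timing lines.
        if not line or line.startswith("WEBVTT") or "-->" in line:
            continue
        # Skip metadata lines (assumed header fields in auto-subs).
        if line.startswith(("Kind:", "Language:", "NOTE")):
            continue
        text = TAG.sub("", line).strip()
        # Assumption: rolling captions repeat the previous line verbatim,
        # so drop a line that equals the one just kept.
        if text and (not lines or lines[-1] != text):
            lines.append(text)
    return " ".join(lines)
```

Usage would be something like `vtt_to_text(open("video.en.vtt", encoding="utf-8").read())`, where the file name depends on yt-dlp's output template.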
I ended up doing the same as this person, downloading the MP4s and then transcribing them myself. I was assuming it was some sort of anti-LLM-scraper feature they put in place.
Has anyone used this --write-auto-subs flag and not been flagged after doing 20 or so videos?
My startup has to use YouTube transcriptions, so we just subscribe to a YouTube transcript API hosted on RapidAPI that downloads subtitles. $1 per 1,000 requests. Pretty cheap.
(I'm using it in https://butter.sonnet.io)
And unlike how your tool will be supported in the future, thousands of users make sure yt-dlp keeps working as Google keeps changing the site (currently 1459 contributors).
YouTube also blocks transcript exports for some services, such as https://youtubetranscript.com/
Retranscribing is a necessary and important part of the creator toolset.
- This Python one is more amenable to modding into your own custom tool: https://hw.leftium.com/#/item/44353447
- Another bash script: https://hw.leftium.com/#/item/41473379
---
They all seem to be built on top of:
- yt-dlp to download video
- whisper for transcription
- ffmpeg for audio/video extraction/processing
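A rough sketch of how those three pieces could be chained from Python. This is not taken from any of the linked tools: the function names, the `source.*` and `audio.wav` file names, and the `base` model choice are all assumptions for illustration, and it expects the `yt-dlp`, `ffmpeg`, and `whisper` (openai-whisper) CLIs to be on PATH.

```python
import subprocess
from pathlib import Path


def download_cmd(url: str, workdir: Path) -> list[str]:
    # yt-dlp: fetch only the best audio stream; yt-dlp fills in %(ext)s.
    return ["yt-dlp", "-f", "bestaudio",
            "-o", str(workdir / "source.%(ext)s"), url]


def extract_cmd(src: Path, wav: Path) -> list[str]:
    # ffmpeg: drop any video (-vn), downmix to mono, resample to 16 kHz.
    return ["ffmpeg", "-y", "-i", str(src),
            "-vn", "-ac", "1", "-ar", "16000", str(wav)]


def transcribe_cmd(wav: Path, outdir: Path, model: str = "base") -> list[str]:
    # openai-whisper CLI: write a plain-text transcript into outdir.
    return ["whisper", str(wav), "--model", model,
            "--output_format", "txt", "--output_dir", str(outdir)]


def transcribe_video(url: str, workdir: Path) -> Path:
    """Download, convert, and transcribe one video; returns the transcript path."""
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(download_cmd(url, workdir), check=True)
    src = next(workdir.glob("source.*"))  # extension is chosen by yt-dlp
    wav = workdir / "audio.wav"
    subprocess.run(extract_cmd(src, wav), check=True)
    subprocess.run(transcribe_cmd(wav, workdir), check=True)
    # Assumption: recent whisper versions name the output after the input stem.
    return workdir / "audio.txt"
```

Keeping the command builders separate from `transcribe_video` makes it easy to swap in a different transcriber (for example one of the Parakeet models) without touching the download step.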
https://github.com/Dicklesworthstone/bulk_transcribe_youtube...
I ended up building a beefed-up version of it that makes polished written documents from the raw transcript; you can try it at
https://en.m.wikipedia.org/wiki/Specht_v._Netscape_Communica...
https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2
For Apple Silicon (MLX) https://huggingface.co/senstella/parakeet-tdt-0.6b-v2-mlx
Bluestein•5h ago
And, yes, indeed, AI coding is having an order-of-magnitude effect along the lines that "low-code" was treading ...
... also, for less-capable or "borderline" coders the effort/benefit equation has radically shifted.
sannysanoff•4h ago
https://old.reddit.com/r/ChatGPTCoding/comments/1lusr07/self...
Gonna be lots of posts of selfware like that soon.