I built this to improve on Whisper transcription accuracy by treating multiple transcript sources (Whisper models, YouTube captions, external transcripts) as independent witnesses and having an LLM adjudicate disagreements under anonymous labels, so no source gets favored by name. The approach borrows from textual criticism and my earlier work on multi-source OCR error correction. Runs locally with Ollama by default (free). Longer writeup with an example survival analysis showing how each source contributes to the final output: https://medium.com/@eringger/improving-whisper-accuracy-by-m...
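The core idea can be sketched roughly like this: align the witnesses word-by-word, collect the spans where they disagree, and hand the anonymously labelled versions to an LLM to adjudicate. This is a minimal illustration using difflib for alignment; the function names are hypothetical, not the tool's actual API:

```python
import difflib

def disagreements(witnesses):
    """Return (base_span, other_span) pairs where witnesses diverge.

    witnesses: transcript strings from independent sources, e.g.
    different Whisper models or YouTube captions.  Alignment here is
    simple word-level diffing; the real tool may align differently.
    """
    base = witnesses[0].split()
    spans = []
    for other in witnesses[1:]:
        words = other.split()
        sm = difflib.SequenceMatcher(a=base, b=words)
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            if tag != "equal":  # keep only the spans that differ
                spans.append((" ".join(base[i1:i2]), " ".join(words[j1:j2])))
    return spans

def adjudication_prompt(witnesses):
    """Build an LLM prompt with anonymous labels (Witness A, B, ...)."""
    labels = (chr(ord("A") + i) for i in range(len(witnesses)))
    lines = [f"Witness {label}: {text}" for label, text in zip(labels, witnesses)]
    lines.append("These independent witnesses disagree. "
                 "Output the single most plausible transcript.")
    return "\n".join(lines)
```

The anonymous labels matter: if the adjudicating model knew which witness was, say, a large Whisper model, it could simply defer to it instead of weighing the actual evidence.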
ringger•1h ago