I would like an AI audio model that I can train on hours of clean recordings of my voice and then feed a very noisy audio file of a lecture I gave, so that it can clean up the recording and remove the noise. Conventional noise filters don't work because the audio quality is so bad. Ideally, the model would preserve the exact timing of the noisy original, because the audio came from a video and I want to put the cleaned audio back with the video without it looking dubbed. I have no experience with AI. Does something like this exist, or could it be set up with minimal effort by someone who is otherwise technically literate (I have used Linux for 20+ years)?