Recapio is a tool that extracts transcripts and generates structured summaries for videos (and web articles). It's not trying to replace watching the content, but rather to act as a 'Ctrl+F' for video context.
One technical challenge I faced: dealing with auto-generated YouTube captions vs. forced captions was messy. I had to build a parser that normalizes the timestamps so that when you click a summary point, the player actually seeks to the correct moment, even if the caption timing has drifted.
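For anyone curious what that kind of normalization can look like, here is a rough sketch, not Recapio's actual parser. It assumes cues arrive as (start, duration, text) tuples in seconds, and the names `Cue`, `normalize_cues`, and `seek_time` are made up for illustration. It sorts cues, collapses consecutive repeats (common in rolling auto-captions), trims overlaps, and then maps a clicked time to the start of the nearest preceding cue.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video
    text: str


def normalize_cues(raw):
    """Turn (start, duration, text) tuples into sorted, non-overlapping cues.

    The input shape is an assumption made for this sketch.
    """
    cues = []
    for start, duration, text in sorted(raw, key=lambda r: r[0]):
        text = " ".join(text.split())  # collapse stray whitespace
        if not text:
            continue  # skip empty cues
        start = max(0.0, float(start))
        end = start + max(0.0, float(duration))
        if cues and cues[-1].text == text:
            # Consecutive repeat of the same line: extend the previous cue.
            cues[-1].end = max(cues[-1].end, end)
            continue
        if cues and cues[-1].end > start:
            # Overlap: the previous cue ends where this one begins.
            cues[-1].end = start
        cues.append(Cue(start, end, text))
    return cues


def seek_time(cues, t):
    """Return the start of the cue active at, or most recently before, t."""
    if not cues:
        return 0.0
    starts = [c.start for c in cues]
    i = bisect_right(starts, t) - 1
    return cues[max(i, 0)].start


raw = [
    (0.0, 4.0, "hello world"),
    (2.0, 4.0, "hello world"),    # rolling-caption repeat
    (3.5, 3.0, "second  line"),   # overlaps the previous cue
    (10.0, 2.0, "  "),            # empty cue
]
cues = normalize_cues(raw)
target = seek_time(cues, 5.0)
```

Snapping to a cue start rather than the raw timestamp means a click lands at the beginning of a spoken line instead of mid-sentence; correcting real drift against the audio would need more than this.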
It has a free tier that should cover most casual usage. I'd love your feedback on the extraction quality.