There’s more info in the post. TL;DR: I used COLMAP to extract camera poses from video frames, then three.js to align and render the video textures.
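For anyone curious about the pose step, here's a minimal sketch of how a COLMAP pose can be mapped onto a three.js camera. It isn't lifted from the project; the `ColmapPose` and `ThreePose` types and the `colmapToThree` helper are names made up for illustration. It assumes COLMAP's usual `images.txt` convention (a unit quaternion `qw qx qy qz` plus translation `tx ty tz` describing a world-to-camera transform, with the camera looking down +z and y pointing down) and three.js's convention (camera looking down -z, y up).

```typescript
// COLMAP pose for one image: X_cam = R(q) * X_world + t (world-to-camera).
// Assumed to be a unit quaternion, as COLMAP writes it.
interface ColmapPose {
  qw: number; qx: number; qy: number; qz: number;
  tx: number; ty: number; tz: number;
}

// Hypothetical output shape: position is the camera centre in world space,
// quaternion is in three.js (x, y, z, w) order.
interface ThreePose {
  position: [number, number, number];
  quaternion: [number, number, number, number];
}

function colmapToThree(p: ColmapPose): ThreePose {
  // Camera centre C = -R^T t. Rotating t by the conjugate quaternion applies R^T.
  // Uses v' = v + 2w(u x v) + 2 u x (u x v), with u the conjugate's vector part.
  const ux = -p.qx, uy = -p.qy, uz = -p.qz, w = p.qw;

  // c = u x t
  const cx = uy * p.tz - uz * p.ty;
  const cy = uz * p.tx - ux * p.tz;
  const cz = ux * p.ty - uy * p.tx;

  // d = u x c
  const dx = uy * cz - uz * cy;
  const dy = uz * cx - ux * cz;
  const dz = ux * cy - uy * cx;

  const rx = p.tx + 2 * (w * cx + dx);
  const ry = p.ty + 2 * (w * cy + dy);
  const rz = p.tz + 2 * (w * cz + dz);

  // Camera-to-world rotation is R^T, followed by a 180 degree flip about the
  // camera's x axis to go from COLMAP axes (y down, z forward) to three.js
  // axes (y up, z backward). As a quaternion product, conj(q) * (0, 1, 0, 0)
  // simplifies to the component shuffle below.
  return {
    position: [-rx, -ry, -rz],
    quaternion: [p.qw, -p.qz, p.qy, p.qx],
  };
}
```

The result can be applied with `camera.position.set(...pose.position)` and `camera.quaternion.set(...pose.quaternion)`. COLMAP's reconstruction has an arbitrary scale and orientation, so some extra alignment on top of this is usually still needed.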
I'm happy with how the content is loaded: it uses a video element, so the browser can start loading frames before any of the component code is initialised.
Check it out and let me know what you think, or if you have any good ideas for videos I should try.