It’s a "black box" flow: upload audio -> Whisper (lyric sync) -> Gemini (director/script) -> Veo/Seedance (video synthesis). It handles everything from LRC generation to final rendering.
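
For anyone curious what the LRC generation step might look like, here is a minimal sketch, not the actual implementation. It assumes Whisper segments arrive as dicts with `start` (seconds) and `text` keys, the shape the open-source `openai-whisper` package returns under `result["segments"]`. The function names `format_lrc_timestamp` and `segments_to_lrc` are hypothetical.

```python
def format_lrc_timestamp(seconds: float) -> str:
    """Format a time in seconds as an LRC tag of the form [mm:ss.xx]."""
    # Work in whole centiseconds so rounding cannot produce "60" seconds.
    centis = max(0, round(seconds * 100))
    minutes, rem = divmod(centis, 6000)
    secs, cs = divmod(rem, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def segments_to_lrc(segments) -> str:
    """Turn Whisper-style segments into LRC text, one timed line per segment.

    Assumption: each segment is a dict with "start" (seconds) and "text".
    """
    lines = []
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue  # skip empty or whitespace-only segments
        lines.append(f"{format_lrc_timestamp(seg['start'])}{text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hand-written sample segments, standing in for real Whisper output.
    sample = [
        {"start": 0.0, "text": " First line"},
        {"start": 75.5, "text": " Second line"},
    ]
    print(segments_to_lrc(sample))
```

Only the segment start times are used here; word-level timing or line merging would need more than this sketch covers.
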
The pipeline supports Realistic, Cartoon, and Abstract styles. I'd love to hear your feedback on shot consistency.