Making AI video usually means jumping between four tools: one for the image, one for the video, another for music, and a separate app for 3D assets. The result is a mess of file transfers.
I updated Textideo to orchestrate all of this in a single timeline.
What's new:
3D Meshes: Converts images to 3D assets using a fast feed-forward pipeline (<60s). It helps with spatial consistency.
Audio: Generates the score and TTS directly, synced to the video blocks.
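
To make "synced to the video blocks" a bit more concrete, here is a minimal sketch of one way it could work. It is not Textideo's actual data model: VideoBlock, AudioCue, and fit_audio_to_blocks are hypothetical names, and it assumes each audio cue is simply pinned to the start and duration of its video block.

```python
from dataclasses import dataclass


@dataclass
class VideoBlock:
    block_id: str
    duration_s: float


@dataclass
class AudioCue:
    block_id: str
    kind: str  # "score" or "tts"
    start_s: float
    duration_s: float


def fit_audio_to_blocks(blocks, kinds=("score", "tts")):
    """Lay blocks end to end and emit one audio cue per kind per block."""
    cues = []
    cursor = 0.0  # running start time of the current block on the timeline
    for block in blocks:
        for kind in kinds:
            cues.append(AudioCue(block.block_id, kind, cursor, block.duration_s))
        cursor += block.duration_s
    return cues


timeline = [VideoBlock("intro", 4.0), VideoBlock("product", 6.5)]
for cue in fit_audio_to_blocks(timeline):
    print(cue)
```

With this layout, changing a block's duration and re-running the function moves every later cue along with it, so the audio never drifts from the picture.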
The Stack:
A Python backend acts as a conductor for several models (Veo, Wan, and custom pipelines). The main challenge was getting inference latency down so the "edit-preview" loop isn't painful.
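
For anyone curious what a "conductor" might look like, here is a rough sketch using asyncio. The model calls are stubs that sleep instead of running inference; they are not the real Veo or Wan APIs, and fake_model and render_block are hypothetical names. It also assumes the video, mesh, and audio steps are independent once the image exists, which is one common way to cut wall-clock latency.

```python
import asyncio
import time


async def fake_model(name: str, seconds: float, payload: str) -> str:
    """Stand-in for a remote model call; sleeps instead of doing inference."""
    await asyncio.sleep(seconds)
    return f"{name}({payload})"


async def render_block(prompt: str) -> dict:
    # The image has to exist first; the video and mesh steps consume it.
    image = await fake_model("image", 0.05, prompt)
    # Video, mesh, and audio are treated as independent here (an assumption),
    # so they run concurrently rather than one after another.
    video, mesh, audio = await asyncio.gather(
        fake_model("video", 0.10, image),
        fake_model("mesh", 0.05, image),
        fake_model("audio", 0.05, prompt),
    )
    return {"image": image, "video": video, "mesh": mesh, "audio": audio}


if __name__ == "__main__":
    started = time.perf_counter()
    result = asyncio.run(render_block("a ceramic teapot on a desk"))
    print(result)
    print(f"elapsed: {time.perf_counter() - started:.2f}s")
```

In this toy version the concurrent stage costs about as much as its slowest call, not the sum of all three, which is the kind of saving that matters for an edit-preview loop.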
There are free credits on signup to test the workflow.
The 3D geometry is decent for props but can struggle with complex organic shapes. I'd appreciate any feedback on the mesh quality.
Nancylily•2h ago
I built this to solve my own "tab fatigue."
Thanks!