I'm Heshan, founder of X-Pilot. We're building an AI video generator for online courses and educational content. Unlike most text-to-video generators that render videos directly from models (and often produce random stock footage unrelated to the actual content), we take a code-first approach: generate editable code layers, let users verify and refine them, then render to video.
The Problem We're Solving
Most AI video generators treat "education" and "marketing" the same: they optimize for "looks good" rather than "logically accurate." When you feed a technical tutorial or course script into a generic video AI, you get:
- Random B-roll that doesn't match the concept being explained
- Incorrect visualizations (e.g., showing a "for loop" diagram when explaining recursion)
- No way to systematically fix errors without regenerating everything
For educators, corporate trainers, and knowledge creators, accuracy matters more than aesthetics. A single incorrect diagram can break a learner's mental model.
Our Approach: Code as the Intermediate Layer
Instead of a text → video black box, we do: Text/PDF/Doc → Structured Code (Remotion + Visual Box Engine) → Editable Preview → Final Render
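To make that intermediate layer concrete, here is a heavily simplified sketch of what a generated composition can look like. The component names, props, and the ./visual-boxes import path are illustrative stand-ins, not our actual API:

```tsx
// Illustrative only: real Visual Box components have a richer prop surface.
import React from "react";
import { AbsoluteFill, Sequence } from "remotion";
import { CauseEffectFlow, ComparisonTable } from "./visual-boxes"; // hypothetical import path

// One generated "scene" per concept in the source script.
export const LessonVideo: React.FC = () => (
  <AbsoluteFill style={{ backgroundColor: "white" }}>
    {/* Frames 0-299: explain why caching reduces latency */}
    <Sequence from={0} durationInFrames={300}>
      <CauseEffectFlow
        concept="Why caching reduces latency"
        steps={["Request arrives", "Cache hit", "Skip the database", "Faster response"]}
      />
    </Sequence>
    {/* Frames 300-599: compare eviction policies */}
    <Sequence from={300} durationInFrames={300}>
      <ComparisonTable
        concept="LRU vs. LFU eviction"
        columns={["LRU", "LFU"]}
        rows={[
          ["Evicts", "Least recently used", "Least frequently used"],
          ["Good for", "Temporal locality", "Stable hot sets"],
        ]}
      />
    </Sequence>
  </AbsoluteFill>
);
```

Because the "video" at this stage is just TSX, verification and refinement look like ordinary code review and diffs rather than re-rolling a generation.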
Tech Stack
- Agent orchestration: LangGraph (with Gemini 2.5 Flash for planning, reasoning, and content structuring)
- Video code generation: Gemini 3.0 for Remotion code, plus Veo 3 for generative footage where needed
- Code-based rendering: Remotion (React-based video framework)
- Knowledge visualization engine: our own "Visual Box Engine", a library of parameterized educational animation components (flowcharts, comparisons, step-by-step sequences, system diagrams, etc.)
- Voice synthesis: Fish Audio (for natural narration)
- Rendering: Google Cloud (distributed video rendering with headless Chrome)
- Code execution sandbox: E2B (for safe, isolated code execution during generation and preview); we plan to move to our own sandbox because E2B often times out and is slow at bundling and rendering
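For a feel of the orchestration side, here is a stripped-down LangGraph.js sketch of the plan → codegen hand-off. Node names, state fields, and prompts are simplified placeholders, not our production graph:

```ts
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

// Planning model; in practice the codegen step would use a stronger code model.
const planner = new ChatGoogleGenerativeAI({ model: "gemini-2.5-flash" });

const PipelineState = Annotation.Root({
  script: Annotation<string>(),       // raw course script / extracted document text
  scenePlan: Annotation<string>(),    // structured scene-by-scene plan
  remotionCode: Annotation<string>(), // generated Remotion + Visual Box TSX
});

export const pipeline = new StateGraph(PipelineState)
  .addNode("plan", async (state) => {
    const res = await planner.invoke(
      `Split this script into scenes, each mapped to one visualization pattern:\n${state.script}`
    );
    return { scenePlan: String(res.content) };
  })
  .addNode("codegen", async (state) => {
    const res = await planner.invoke(
      `Write a Remotion composition (TSX) implementing this scene plan:\n${state.scenePlan}`
    );
    return { remotionCode: String(res.content) };
  })
  .addEdge(START, "plan")
  .addEdge("plan", "codegen")
  .addEdge("codegen", END)
  .compile();

// Usage: const result = await pipeline.invoke({ script: "..." });
```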
Why Remotion + Custom Components?
We chose Remotion because:
1. Editability: Every visual element is React code. Users (or our AI agents) can modify text, swap components, and adjust timing without touching raw video files.
2. Reproducibility: Same input → same output. No model randomness in the final render.
3. Composability: We built a "Visual Box" library of reusable animation patterns for education (e.g., "cause-and-effect flow," "comparison table," "hierarchical breakdown"). These aren't generic motion graphics; they're designed around pedagogical principles.
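As a concrete illustration of points 1 and 2: the generated composition is registered like any other Remotion project, so "editing the video" means editing props or TSX, and re-rendering the same inputs produces identical frames. The IDs and timing values below are placeholders:

```tsx
import React from "react";
import { Composition } from "remotion";
import { LessonVideo } from "./LessonVideo"; // the generated composition sketched above

// Every frame is a pure function of (props, frame number), so the same inputs
// always render the same pixels; an agent can tweak duration, fps, or props
// here without touching any rendered footage.
export const RemotionRoot: React.FC = () => (
  <Composition
    id="LessonVideo"
    component={LessonVideo}
    durationInFrames={600}
    fps={30}
    width={1920}
    height={1080}
  />
);
```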
The trade-off: We sacrifice some "cinematic quality" for logical accuracy and user control. Right now, output can feel closer to "animated slides" than "documentary footage"—which is actually our biggest unsolved challenge (more on that below).
What We're Struggling With (and Planning to Fix)
1. Code Error Rate: Generating Remotion code via LLMs is powerful but error-prone (see the compile-check sketch after this list).
2. Limited Asset Handling: Right now, if a user wants to insert a custom image/GIF/video mid-generation, they need to upload → we process → regenerate. This breaks flow.
3. The "PPT Feel" Problem: This is the hardest one. Because we prioritize structure and editability, our videos can feel like "animated PowerPoint" rather than "produced content."
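On the code error rate, one simplified way to picture a mitigation (a sketch under assumptions, not exactly what we ship): compile-check every generated scene before it reaches the preview sandbox and feed errors back to the model. The generateScene callback is a hypothetical stand-in for the LLM call:

```ts
import { transform } from "esbuild";

// Generate → compile-check → retry loop for LLM-written Remotion scenes.
// esbuild only catches syntax-level problems; type and runtime errors are
// still caught later by the sandboxed preview render.
export async function generateValidScene(
  generateScene: (prompt: string, previousError?: string) => Promise<string>, // hypothetical LLM call
  prompt: string,
  maxAttempts = 3
): Promise<string> {
  let lastError: string | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const code = await generateScene(prompt, lastError);
    try {
      // Parse/compile the TSX; throws with a readable message on syntax errors.
      await transform(code, { loader: "tsx" });
      return code; // compiles cleanly, hand off to the preview sandbox
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  throw new Error(`Scene generation failed after ${maxAttempts} attempts: ${lastError}`);
}
```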
To address the third problem, we're experimenting with:
- Hybrid rendering: use generative video (Veo) for transitions/B-roll, but keep Visual Boxes for core explanations
- Cinematic presets: camera movements, depth effects, color grading, applied as composable layers (sketched after this list)
- Motion design constraints: teaching our agent to follow motion design principles (easing curves, visual hierarchy, pacing)
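To make "applied as composable layers" concrete, here's a rough sketch of a cinematic preset as a Remotion wrapper: a slow push-in plus a vignette around whatever Visual Box it wraps. The preset name and values are illustrative:

```tsx
import React from "react";
import { AbsoluteFill, useCurrentFrame, useVideoConfig, interpolate, Easing } from "remotion";

// Wraps any scene in a slow camera push-in and a soft vignette, without
// touching the scene's own code.
export const CinematicPushIn: React.FC<{ children: React.ReactNode }> = ({ children }) => {
  const frame = useCurrentFrame();
  const { durationInFrames } = useVideoConfig();
  const scale = interpolate(frame, [0, durationInFrames], [1, 1.08], {
    easing: Easing.inOut(Easing.ease),
  });
  return (
    <AbsoluteFill>
      <AbsoluteFill style={{ transform: `scale(${scale})` }}>{children}</AbsoluteFill>
      {/* Color/grading layers compose the same way, stacked on top. */}
      <AbsoluteFill
        style={{ background: "radial-gradient(ellipse at center, transparent 60%, rgba(0,0,0,0.35) 100%)" }}
      />
    </AbsoluteFill>
  );
};

// Usage: <CinematicPushIn><CauseEffectFlow ... /></CinematicPushIn>
```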
Honest question for HN: Has anyone solved this trade-off between "programmatically editable" and "cinematic quality"? I'd love to hear how others have approached it (especially in contexts where correctness > vibes).
A few more details on the stack:
- Why Gemini over GPT-5/Claude 4.5 for agent orchestration: Gemini 3.0 is better at generating React code.
- Visual Box Engine specifics: ~300 parameterized animation templates. Each "box" is a React component with props like {concept, relationships, emphasis, timing}. Example: "CauseEffectFlow" takes an array of steps and auto-generates animated arrows + state transitions (rough sketch after this list).
- E2B sandboxing: We run Remotion preview renders in isolated environments. This prevents malicious/buggy code from affecting other users' jobs.
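For a rough sense of what a Visual Box looks like inside, here's a heavily simplified CauseEffectFlow-style component. The real one has a richer prop surface ({concept, relationships, emphasis, timing}) and proper layout/arrow logic; this sketch keeps only the steps and a per-step timing knob:

```tsx
import React from "react";
import { AbsoluteFill, useCurrentFrame, interpolate } from "remotion";

type CauseEffectFlowProps = {
  concept: string;
  steps: string[];
  framesPerStep?: number; // timing knob an agent can adjust without regenerating the scene
};

export const CauseEffectFlow: React.FC<CauseEffectFlowProps> = ({
  concept,
  steps,
  framesPerStep = 30,
}) => {
  const frame = useCurrentFrame();
  return (
    <AbsoluteFill style={{ justifyContent: "center", alignItems: "center", gap: 24 }}>
      <h1 style={{ fontSize: 48 }}>{concept}</h1>
      <div style={{ display: "flex", alignItems: "center", gap: 16 }}>
        {steps.map((step, i) => {
          // Each step (and the arrow before it) fades in on its own schedule.
          const opacity = interpolate(
            frame,
            [i * framesPerStep, i * framesPerStep + 15],
            [0, 1],
            { extrapolateLeft: "clamp", extrapolateRight: "clamp" }
          );
          return (
            <React.Fragment key={i}>
              {i > 0 && <span style={{ opacity, fontSize: 40 }}>→</span>}
              <div style={{ opacity, padding: 16, border: "2px solid #333", borderRadius: 8, fontSize: 28 }}>
                {step}
              </div>
            </React.Fragment>
          );
        })}
      </div>
    </AbsoluteFill>
  );
};
```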
Happy to answer questions about any part of the stack!