Show HN: X-Pilot – Code-Driven AI Video Generator for Online Courses

https://www.x-pilot.ai/
1•bianheshan•1h ago
Hi HN,

I'm Heshan, founder of X-Pilot. We're building an AI video generator for online courses and educational content. Unlike most text-to-video generators, which render videos directly from models (and often produce random stock footage unrelated to the actual content), we take a code-first approach: generate editable code layers, let users verify and refine them, then render to video.

The Problem We're Solving

Most AI video generators treat "education" and "marketing" the same: they optimize for "looks good" rather than "logically accurate." When you feed a technical tutorial or course script into a generic video AI, you get:

- Random B-roll that doesn't match the concept being explained
- Incorrect visualizations (e.g., showing a "for loop" diagram when explaining recursion)
- No way to systematically fix errors without regenerating everything

For educators, corporate trainers, and knowledge creators, accuracy matters more than aesthetics. A single incorrect diagram can break a learner's mental model.

Our Approach: Code as the Intermediate Layer

Instead of a text → video black box, our pipeline is: Text/PDF/Doc → Structured Code (Remotion + Visual Box Engine) → Editable Preview → Final Render

Tech Stack

- Agent orchestration: LangGraph (with Gemini 2.5 Flash for planning, reasoning, and content structuring)
- Video code generation: Gemini 3.0 for Remotion code, plus Veo 3 for generative footage where needed
- Code-based rendering: Remotion (React-based video framework)
- Knowledge visualization engine: our own "Visual Box Engine," a library of parameterized educational animation components (flowcharts, comparisons, step-by-step sequences, system diagrams, etc.)
- Voice synthesis: Fish Audio (for natural narration)
- Rendering: Google Cloud (distributed video rendering using headless Chrome)
- Code execution sandbox: E2B (for safe, isolated code execution during generation and preview). We plan to move to our own sandbox, because E2B often times out and performs poorly when bundling and rendering.

Why Remotion + Custom Components?

We chose Remotion because:

1. Editability: Every visual element is React code. Users (or our AI agents) can modify text, swap components, and adjust timing without touching raw video files.
2. Reproducibility: Same input → same output. No model randomness in final render.
3. Composability: We built a "Visual Box" library of reusable animation patterns for education (e.g., "cause-and-effect flow," "comparison table," "hierarchical breakdown"). These aren't generic motion graphics; they're designed around pedagogical principles.
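The editability and reproducibility points can be illustrated with a minimal sketch. This is not X-Pilot's code and it does not use Remotion itself: `SceneSpec`, `editScene`, and `describeFrame` are hypothetical names, and the scene is modeled as plain data instead of a React component. The idea it shows is that an edit is a small patch to a spec, and that what appears on a given frame is a pure function of the spec and the frame number.

```typescript
// Hypothetical sketch: a scene modeled as plain data.
// None of these names come from X-Pilot or Remotion.
interface SceneSpec {
  component: string; // e.g. "ComparisonTable" (illustrative name)
  text: string;
  startFrame: number;
  durationInFrames: number;
}

// Editing returns a new spec and leaves the original untouched,
// so a change is a small patch rather than a full regeneration.
function editScene(scene: SceneSpec, patch: Partial<SceneSpec>): SceneSpec {
  return { ...scene, ...patch };
}

// Pure function of (scene, frame): the same input always gives the same output.
function describeFrame(scene: SceneSpec, frame: number): string {
  const local = frame - scene.startFrame;
  if (local < 0 || local >= scene.durationInFrames) {
    return "hidden";
  }
  // Progress runs from 0 on the first visible frame to 1 on the last.
  const progress =
    scene.durationInFrames > 1 ? local / (scene.durationInFrames - 1) : 1;
  return `${scene.component}:${scene.text}@${progress.toFixed(2)}`;
}

const original: SceneSpec = {
  component: "ComparisonTable",
  text: "Loops vs. recursion",
  startFrame: 30,
  durationInFrames: 61,
};

// Fix a label without touching timing or layout.
const edited = editScene(original, { text: "Iteration vs. recursion" });
```

Under these assumptions, `describeFrame(edited, 60)` depends only on its arguments, which is the property that makes a final render repeatable.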

The trade-off: We sacrifice some "cinematic quality" for logical accuracy and user control. Right now, output can feel closer to "animated slides" than "documentary footage"—which is actually our biggest unsolved challenge (more on that below).

What We're Struggling With (and Planning to Fix)

1. Code Error Rate: Generating Remotion code via LLMs is powerful but error-prone.
2. Limited Asset Handling: Right now, if a user wants to insert a custom image/GIF/video mid-generation, they need to upload → we process → regenerate. This breaks flow.
3. The "PPT Feel" Problem: This is the hardest one. Because we prioritize structure and editability, our videos can feel like "animated PowerPoint" rather than "produced content."

We're experimenting with:

- Hybrid rendering: Use generative video (Veo) for transitions/B-roll, but keep Visual Boxes for core explanations
- Cinematic presets: Camera movements, depth effects, and color grading, applied as composable layers
- Motion design constraints: Teaching our agent to follow motion design principles (easing curves, visual hierarchy, pacing)
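One possible reading of "composable layers" and "easing curves" is sketched below. Everything here is an assumption for illustration: `FrameStyle`, `Layer`, `compose`, and the three example layers are hypothetical, and the numeric amounts are arbitrary. Each layer is a small function from a style and a progress value `t` in [0, 1] to a new style, so presets can be stacked without knowing about each other.

```typescript
// Hypothetical sketch; names and numbers are illustrative, not from X-Pilot.
interface FrameStyle {
  scale: number;
  translateX: number;
  brightness: number;
}

// A layer adjusts the style for a given progress t in [0, 1].
type Layer = (style: FrameStyle, t: number) => FrameStyle;

// Standard cubic ease-in-out curve: slow start, fast middle, slow end.
function easeInOutCubic(t: number): number {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

// "Camera movement" layers driven by the easing curve.
const slowZoom: Layer = (s, t) => ({
  ...s,
  scale: s.scale * (1 + 0.1 * easeInOutCubic(t)),
});
const panRight: Layer = (s, t) => ({
  ...s,
  translateX: s.translateX + 40 * easeInOutCubic(t),
});

// A stand-in for color grading: a constant brightness lift.
const warmGrade: Layer = (s) => ({ ...s, brightness: s.brightness * 1.05 });

// Apply layers left to right, feeding each result into the next.
function compose(...layers: Layer[]): Layer {
  return (style, t) => layers.reduce((acc, layer) => layer(acc, t), style);
}

const preset = compose(slowZoom, panRight, warmGrade);
const base: FrameStyle = { scale: 1, translateX: 0, brightness: 1 };
```

Calling `preset(base, t)` for each frame's progress yields the style for that frame, and a preset can be removed or reordered without touching the underlying Visual Box.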

Honest question for HN: Has anyone solved this trade-off between "programmatically editable" and "cinematic quality"? I'd love to hear how others have approached it (especially in contexts where correctness > vibes).

Comments

bianheshan•1h ago
OP here. A few additional technical details folks might be curious about:

- Why Gemini over GPT-5/Claude 4.5 for agent orchestration: Gemini 3.0 is better at generating React code.

- Visual Box Engine specifics: ~300 parameterized animation templates. Each "box" is a React component with props like {concept, relationships, emphasis, timing}. Example: "CauseEffectFlow" takes an array of steps and auto-generates animated arrows + state transitions.

- E2B sandboxing: We run Remotion preview renders in isolated environments. This prevents malicious/buggy code from affecting other users' jobs.

Happy to answer questions about any part of the stack!
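As a rough illustration of the "CauseEffectFlow" description above, here is a minimal sketch of how an array of steps could be turned into per-frame step states and arrow progress. It is an assumption-laden model, not the Visual Box Engine's API: only `steps` and `timing` echo the props mentioned in the comment, while `framesPerStep`, `arrowFrames`, `causeEffectFlowAt`, and the state names are hypothetical. It is written as a pure function without React so it can run on its own.

```typescript
// Hypothetical sketch of a "CauseEffectFlow"-style box. Only `steps` and
// `timing` echo the post; every other name is an assumption.
interface CauseEffectFlowProps {
  steps: string[];
  // Assumes framesPerStep > 0 and arrowFrames > 0.
  timing: { framesPerStep: number; arrowFrames: number };
}

type StepState = "pending" | "active" | "done";

interface Arrow {
  from: number;
  to: number;
  progress: number; // 0 = not drawn, 1 = fully drawn
}

interface FlowFrame {
  states: StepState[];
  arrows: Arrow[];
}

function causeEffectFlowAt(props: CauseEffectFlowProps, frame: number): FlowFrame {
  const { steps, timing } = props;

  // Each step is active for framesPerStep frames; the last one stays active.
  const activeIndex = Math.max(
    0,
    Math.min(Math.floor(frame / timing.framesPerStep), steps.length - 1),
  );

  const states = steps.map((_step, i): StepState =>
    i < activeIndex ? "done" : i === activeIndex ? "active" : "pending",
  );

  const arrows: Arrow[] = [];
  for (let i = 0; i < steps.length - 1; i++) {
    // The arrow from step i to i+1 finishes drawing exactly when step i+1 activates.
    const arrowEnd = (i + 1) * timing.framesPerStep;
    const arrowStart = arrowEnd - timing.arrowFrames;
    const raw = (frame - arrowStart) / timing.arrowFrames;
    arrows.push({ from: i, to: i + 1, progress: Math.min(Math.max(raw, 0), 1) });
  }

  return { states, arrows };
}

const flow: CauseEffectFlowProps = {
  steps: ["Input changes", "State updates", "View re-renders"],
  timing: { framesPerStep: 30, arrowFrames: 10 },
};
```

In a real Remotion component the frame number would come from the framework and the returned states and arrow progress would drive the rendered shapes; here the function is simply called with an explicit frame.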
