I built a small Sora-style video generator as a side experiment

https://saro2.ai
2•kelly99•2mo ago

Comments

kelly99•2mo ago
Hey HN,

I’ve been spending the past month diving into AI video generation — not just using models, but trying to understand the actual constraints behind them. After prototyping a small Sora-style generator on my own, I started to notice a few deeper patterns about the industry that I wanted to share and get feedback on.

1. AI video tools aren’t limited by “models”

Most of the friction today isn’t about model quality. It comes from:

region-locked access

invite-only rollouts

heavy watermarking

friction in basic usage

short duration limits

no multi-scene support

pricing that is opaque or unsuitable for small creators

The technology is improving fast — but the accessibility layer hasn’t caught up.

This is why the majority of creators (especially small merchants, indie filmmakers, TikTok sellers, UGC creators) still can’t practically adopt AI video at scale.

2. Multi-scene generation is the “real moat”

Most models can do a single beautiful 2-4 second shot.

But real use cases — ads, storytelling, product demos — need:

shot transitions

visual consistency

character identity retention

stable camera paths

narrative structure

The real challenge is not “make a clip”, but “make a sequence”.

That’s where pipelines, not models, matter.

3. The real bottleneck is temporal coherence

From my experiments, the hardest problems aren’t fancy effects — they’re the boring ones:

slight drift in character identity

physics mismatch between shots

exposure shifts

motion jitter at boundaries

model choosing different “interpretations” each time
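To make one of these concrete, exposure shifts at a cut can be measured cheaply before any fix is attempted. The following is a minimal sketch, not the prototype's actual code. It assumes each frame is already available as a 2D grid of luminance values in [0, 255]; the names `Frame`, `mean_luma`, `exposure_shift`, `flag_boundaries` and the threshold of 10 are all hypothetical.

```python
from statistics import fmean

# Assumption: a frame is a list of rows of luminance values in [0, 255].
Frame = list[list[float]]


def mean_luma(frame: Frame) -> float:
    """Average luminance over all pixels of one frame."""
    return fmean(px for row in frame for px in row)


def exposure_shift(prev_last: Frame, next_first: Frame) -> float:
    """Absolute change in mean luminance across a shot boundary."""
    return abs(mean_luma(next_first) - mean_luma(prev_last))


def flag_boundaries(shots: list[list[Frame]], threshold: float = 10.0) -> list[int]:
    """Return each index i where the cut from shot i to shot i+1 jumps in exposure.

    The threshold is an illustrative value, not a tuned one.
    """
    return [
        i
        for i in range(len(shots) - 1)
        if exposure_shift(shots[i][-1], shots[i + 1][0]) > threshold
    ]
```

A flagged boundary only says that two adjacent shots disagree on brightness. It does not say which shot is wrong, so it is more useful as a trigger for regeneration or grading than as a fix in itself.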

There’s no perfect solution yet. Some combination of:

prompt redistribution

style anchors

conditioning

intermediate frames

shot graphs

works “okay”, but there’s still a huge open research space.
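For readers unfamiliar with the terms, here is one possible reading of "shot graph", "style anchors" and "prompt redistribution" as a data structure. It is a sketch under assumptions, not a description of how Saro2.ai works: the class names, fields, and the prompt template are hypothetical. The idea shown is that shared descriptions (character, look) live once on the graph and are redistributed into every per-shot prompt, while edges record which earlier shot a later one should be conditioned on.

```python
from dataclasses import dataclass, field


@dataclass
class Shot:
    shot_id: str
    action: str  # what happens in this shot only
    # Hypothetical: ids of earlier shots whose final frame conditions this one.
    depends_on: list[str] = field(default_factory=list)


@dataclass
class ShotGraph:
    style_anchor: str      # shared look: lighting, lens, palette
    character_anchor: str  # shared identity description
    shots: dict[str, Shot] = field(default_factory=dict)

    def add(self, shot: Shot) -> None:
        """Add a shot; dependencies must already be in the graph."""
        if shot.shot_id in self.shots:
            raise ValueError(f"duplicate shot id: {shot.shot_id}")
        missing = [d for d in shot.depends_on if d not in self.shots]
        if missing:
            raise ValueError(f"unknown dependencies: {missing}")
        self.shots[shot.shot_id] = shot

    def order(self) -> list[str]:
        """Generation order.

        Insertion order is a valid topological order because add()
        rejects references to shots that do not exist yet.
        """
        return list(self.shots)

    def prompt_for(self, shot_id: str) -> str:
        """Redistribute the shared anchors into one shot's prompt."""
        shot = self.shots[shot_id]
        return f"{self.character_anchor}. {shot.action}. {self.style_anchor}."
```

Usage, with made-up prompts:

```python
g = ShotGraph(style_anchor="soft daylight, 35mm",
              character_anchor="a woman in a red coat")
g.add(Shot("s1", "she picks up the mug"))
g.add(Shot("s2", "she takes a sip", depends_on=["s1"]))
for sid in g.order():
    print(sid, "->", g.prompt_for(sid))
```

Repeating the anchors verbatim in every prompt is a heuristic. It reduces drift in wording but does nothing by itself about the model choosing a different interpretation of the same words.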

4. Small creators care less about model elegance — more about “does it work for my product?”

This surprised me.

I talked to some merchants and small creators. What they wanted wasn’t:

“best model”

“highest fidelity”

“latest architecture”

They asked for:

no watermark

9:16 format

product-handheld shots

consistent 20–25s video

don’t make me wait

just give me something I can post today

It’s a very different set of priorities than what model researchers focus on.

5. The infra is the unsung hero

Most public discussions focus on models, but from building my prototype I realized:

async queues

model switching

fallback logic

caching policies

GPU scheduling

latency constraints

matter far more for practical AI video creation than architecture diagrams.

Without good infra, even the best models feel unusable.
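As a rough illustration of how async queues, model switching and fallback logic can fit together, here is a small sketch using only Python's standard `asyncio`. It is not the prototype's backend. The `Backend` signature, the backend names, and the `flaky` and `stable` stand-ins are hypothetical; a real system would call actual model APIs and would also need GPU scheduling and caching, which are left out.

```python
import asyncio
from typing import Awaitable, Callable

# Assumption: a backend takes a prompt and returns a clip identifier.
Backend = Callable[[str], Awaitable[str]]


async def generate_with_fallback(
    prompt: str,
    backends: list[tuple[str, Backend]],
    timeout_s: float = 30.0,
) -> tuple[str, str]:
    """Try each backend in order; return (backend_name, clip_id) from the first success."""
    errors: list[str] = []
    for name, backend in backends:
        try:
            clip_id = await asyncio.wait_for(backend(prompt), timeout=timeout_s)
            return name, clip_id
        except Exception as exc:  # includes timeouts; cancellation is not caught
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


async def worker(queue: asyncio.Queue, backends: list[tuple[str, Backend]], results: dict) -> None:
    """Pull (job_id, prompt) pairs off the queue until cancelled."""
    while True:
        job_id, prompt = await queue.get()
        try:
            results[job_id] = await generate_with_fallback(prompt, backends)
        except RuntimeError as exc:
            results[job_id] = ("failed", str(exc))
        finally:
            queue.task_done()


# Stand-in backends for the demo (hypothetical).
async def flaky(prompt: str) -> str:
    raise ConnectionError("primary down")


async def stable(prompt: str) -> str:
    return f"clip:{prompt}"


async def main() -> dict:
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    backends = [("primary", flaky), ("secondary", stable)]
    workers = [asyncio.create_task(worker(queue, backends, results)) for _ in range(2)]
    for job_id, prompt in enumerate(["mug on table", "hand holding mug"]):
        queue.put_nowait((job_id, prompt))
    await queue.join()  # wait until every job has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this demo every job falls through from the failing "primary" to "secondary". The ordering of `backends` is the whole switching policy here; anything smarter (cost, queue depth, per-model strengths) would replace that list with a selection function.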

A prototype I built while exploring these ideas

As a way to understand these bottlenecks more concretely, I built a small prototype called Saro2.ai — basically an experiment in:

10s cinematic clip generation

25s multi-scene “storyboard” generation

attempts at shot consistency

simple scene → shot graph

a multi-model backend with light scheduling

It requires login (to control compute use), but I’m mainly sharing it as an example of the things I’m testing, not trying to “launch a product”.

Here’s the link if anyone wants to see how it behaves: https://saro2.ai/

What I’m hoping to learn

If you’ve worked on:

temporal modeling

multi-scene pipelines

conditioning

generative video infra

shot consistency strategies

I’d love to hear your perspective.

Especially curious about:

what people think the real frontier is

what “must solve” engineering problems exist before AI video is truly usable

whether multi-scene consistency is solvable with heuristics or requires new architectures

Happy to share more details about the pipeline or what didn’t work.

Thanks for reading — and I’d appreciate any thoughts from people working in (or following) this space.
