The slightly unusual part is our small budget and where it comes from. We come from several years of indie gamedev and game modding. Our mods have millions of downloads, and they have been covering a meaningful chunk of our cloud GPU costs. So the budget constraint is the thing we’ve been building around this whole time.
Our first model, DF-World 0.1 Preview, has 1.5B parameters and generates video at 480p. It supports real-time mouse and keyboard control in both first-person and third-person view, accepts mixed multimodal inputs (text, image, video), and lets you reprompt mid-stream to change the world on the fly. It runs at ~10-12 FPS on a single RTX 4090 in fp8 (~4GB VRAM), and even works on my RTX 2060 laptop, fully locally. DF-World 0.1 allows infinite rollouts, though error accumulation changes the world too much after about a minute of streaming.
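For a rough sense of what those numbers imply, here is a back-of-the-envelope sketch in Python. It only uses the figures quoted above. The assumptions are that all 1.5B weights are stored in fp8 at one byte each, and that the rest of the ~4GB goes to everything else (activations, caches, auxiliary models), which we have not broken down here. The per-frame time is a throughput figure, so actual input-to-screen latency can be higher.

```python
# Back-of-the-envelope numbers from the figures quoted in the post.
PARAMS = 1.5e9            # 1.5B parameters (from the post)
BYTES_PER_PARAM_FP8 = 1   # fp8 stores one byte per weight
REPORTED_VRAM_GB = 4.0    # "~4GB VRAM" (from the post)
FPS_RANGE = (10, 12)      # "~10-12 FPS" on an RTX 4090 (from the post)
DRIFT_SECONDS = 60        # drift becomes a problem after about a minute


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory taken by the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def frame_time_ms(fps: float) -> float:
    """Average wall-clock time available per generated frame."""
    return 1000.0 / fps


# Assumption: every weight is held in fp8; the remainder is "everything else".
weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM_FP8)
other_gb = REPORTED_VRAM_GB - weights_gb

frame_ms = [frame_time_ms(f) for f in FPS_RANGE]
frames_before_drift = [f * DRIFT_SECONDS for f in FPS_RANGE]

print(f"weights: {weights_gb:.1f} GB, everything else: {other_gb:.1f} GB")
print(f"time per frame: {frame_ms[1]:.0f}-{frame_ms[0]:.0f} ms")
print(f"frames in {DRIFT_SECONDS}s: {frames_before_drift[0]}-{frames_before_drift[1]}")
```

Under these assumptions the weights take about 1.5 GB, each frame gets roughly 83-100 ms, and a minute of streaming is 600-720 autoregressive frames, which is the horizon over which errors have to stay small.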
Since we couldn’t afford to train big models from scratch, we took LongLive 1 (an autoregressive finetune of the Wan 2.1 1.3B DiT) as the video backbone, adopted the residual action module architecture from Matrix-Game 2, and trained it to work with dual-perspective control. We built a custom causal runtime layer on top of the LongLive pipeline, added zero-shot multimodal conditioning, and optimized heavily for low VRAM and consumer devices.
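To illustrate the general idea of a residual action module, here is a toy sketch in plain Python. This is not the Matrix-Game 2 architecture or our actual code: the class name, the sizes, the action layout, and the zero-initialised output projection are all assumptions made for illustration. The point is only that the action branch adds a correction on top of the backbone's features, so a pretrained video model keeps behaving as before until the branch is trained.

```python
import math
import random

# Hypothetical sizes and action layout, chosen only for illustration.
HIDDEN_DIM = 8
ACTION_DIM = 6   # e.g. [W, A, S, D, mouse_dx, mouse_dy]


def linear(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class ResidualActionModule:
    """Adds an action-dependent correction to a backbone hidden state."""

    def __init__(self, hidden_dim, action_dim, seed=0):
        rng = random.Random(seed)
        # Input projection: randomly initialised, maps action -> hidden.
        self.w_in = [[rng.gauss(0, 0.1) for _ in range(action_dim)]
                     for _ in range(hidden_dim)]
        # Output projection: zero-initialised (an assumption), so the
        # module is a no-op until training moves these weights.
        self.w_out = [[0.0] * hidden_dim for _ in range(hidden_dim)]

    def __call__(self, hidden, action):
        act = [math.tanh(x) for x in linear(self.w_in, action)]
        delta = linear(self.w_out, act)
        # Residual connection: backbone features pass through unchanged,
        # and the action branch only adds a correction on top.
        return [h + d for h, d in zip(hidden, delta)]


hidden = [0.5] * HIDDEN_DIM                 # stand-in for a backbone feature
action = [1.0, 0.0, 0.0, 0.0, 0.2, -0.1]    # "W" held, small mouse move
module = ResidualActionModule(HIDDEN_DIM, ACTION_DIM)

# Before training, the output equals the backbone's hidden state.
untrained = module(hidden, action)

# Simulate a trained state by making the output projection non-zero.
module.w_out = [[0.05 if i == j else 0.0 for j in range(HIDDEN_DIM)]
                for i in range(HIDDEN_DIM)]
steered = module(hidden, action)

print(untrained == hidden, steered != hidden)
```

In a real DiT the action signal would be injected per block and per token rather than into a single vector, but the residual structure is the same: the backbone path is left intact and the action path is learned as an additive term.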
We still made some big strategic mistakes. For example, we spent too long perfecting video generation before control was even working, and we kept iterating and refining instead of shipping much earlier. Thankfully, all the great open-source releases kept bailing us out. The result is still clearly behind world models from frontier US/Chinese labs, but it’s a working real-time controllable loop, the first step in our world modelling attempt. For us, this feels like the natural next step from modding and gamedev, except that instead of hand-building every world, we built a model that generates them in real time.
The current limitations are obvious: DF-World 0.1 has weak spatial memory, error accumulation after ~1 minute, and latency that is technically real-time but not yet comfortable to play. We are not claiming a breakthrough in world models. This is just the first visible loop of something we want to keep pushing further.
We are already building the next model, DF-World 0.5, and plan to open-source it once it’s less rough. We’d really love honest feedback on the pre-recorded demos on our website (we couldn’t offer on-demand demos due to budget limitations), and your overall thoughts on DreamForge.