"To improve autoregressive stability for this research preview, what we’re sharing today can be considered a narrow distribution model: it's pre-trained on video of the world, and post-trained on video from a smaller set of places with dense coverage. The tradeoff of this post-training is that we lose some generality, but gain more stable, long-running autoregressive generation."
I wonder if it'd break our brains more if the environment changes as the viewpoint changes, but doesn't change back (e.g. if there's a horse, you pan left, pan back right, and the horse is now a tiger).
The resolution is extremely low. The website doesn't specify, but I'd guess it's only 160x120. Such a low resolution was necessary to render it in real time and maintain a reasonable frame rate. To try to hide the blurring a bit, they apply some filters to add scan lines and other effects to make it look like an old TV.
That said, I'd be surprised if anybody could gather enough hardware to run it at a usable resolution, let alone something like 1080p. That's literally over 100x the pixels of 160x120.
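For anyone who wants to check the ratio, here's a quick sketch of the arithmetic. It assumes my 160x120 guess is right (the site doesn't publish a figure) and takes 1080p to mean 1920x1080; the function name is just for illustration.

```python
# Assumption: 160x120 is a guess at the demo's resolution, not a published spec.
# Assumption: "1080p" means 1920x1080.

def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in a frame of the given size."""
    return width * height

low = pixel_count(160, 120)
full_hd = pixel_count(1920, 1080)
ratio = full_hd / low

print(f"{low} px vs {full_hd} px: {ratio:.0f}x")
```

Raw pixel count is only a rough proxy here; how the generation cost actually scales with resolution depends on the model, which I don't know.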
bkmeneguello•1d ago