> The code is being prepared for public release; pretrained weights and full training/inference pipelines are planned.
Any ideas on how it would be different from and better than "traditional" PCG? Seems like it'd give you more resource consumption, worse results and less control, none of which seem like a benefit.
> We tackle the challenge of generating the infinitely extendable 3D world — large, continuous environments with coherent geometry and realistic appearance. Existing methods face key challenges: 2D-lifting approaches suffer from geometric and appearance inconsistencies across views, 3D implicit representations are hard to scale up, and current 3D foundation models are mostly object-centric, limiting their applicability to scene-level generation. Our key insight is leveraging strong generation priors from pre-trained 3D models for structured scene block generation. To this end, we propose WorldGrow, a hierarchical framework for unbounded 3D scene synthesis. Our method features three core components: (1) a data curation pipeline that extracts high-quality scene blocks for training, making the 3D structured latent representations suitable for scene generation; (2) a 3D block inpainting mechanism that enables context-aware scene extension; and (3) a coarse-to-fine generation strategy that ensures both global layout plausibility and local geometric/textural fidelity. Evaluated on the large-scale 3D-FRONT dataset, WorldGrow achieves SOTA performance in geometry reconstruction, while uniquely supporting infinite scene generation with photorealistic and structurally consistent outputs. These results highlight its capability for constructing large-scale virtual environments and potential for building future world models.
It's about generating interesting virtual space!
I know 'interesting' is subjective, but your comment is demonstrably false. Just type "mario 64 staircase" into youtube, and look at the hundreds (thousands? millions?) of videos and many millions of views.
Maybe the idea is to create environments for AI robotics training.
Consider the patterns generated by cellular automata.
Both tend to stay interesting in the small scale but lose it to boring chaos in the large.
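For the cellular-automaton side of that, here is a rough sketch of a 1D elementary automaton you can run to see it for yourself. Rule 30, the wrap-around edges, and the names `step` and `run` are my own choices for illustration, not anything from the paper.

```python
def step(cells, rule=30):
    """Apply one generation of an elementary cellular automaton.

    `cells` is a list of 0/1 values; edges wrap around (an assumption).
    `rule` is the Wolfram rule number (0-255).
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbours form a 3-bit index into the rule's bit pattern.
        idx = (left << 2) | (mid << 1) | right
        out.append((rule >> idx) & 1)
    return out


def run(width=63, steps=30, rule=30):
    """Start from a single live cell in the middle and collect every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

Print a few dozen rows and the local triangles are fun to look at; crank `width` and `steps` up and it mostly reads as noise.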
For this reason I think the better approach is to start with a simple level-scale form and then refine it into smaller parts, and then to refine those parts and so on.
(Vs plugging away at tunnel-building like a mole)
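A minimal sketch of what I mean by big-to-small, using plain recursive rectangle splitting (basically BSP). This is a stand-in I picked for illustration, not WorldGrow's method; `refine`, `min_size`, and the 64x48 starting size are all made up.

```python
import random


def refine(rect, depth, rng, min_size=4):
    """Recursively split a rectangle (x, y, w, h) into smaller parts.

    Starts from the whole level and stops when `depth` runs out or the
    piece is too small to split without going below `min_size`.
    """
    x, y, w, h = rect
    if depth == 0 or max(w, h) < 2 * min_size:
        return [rect]
    if w >= h:
        # Cut along the longer axis so parts stay reasonably proportioned.
        cut = rng.randint(min_size, w - min_size)
        a, b = (x, y, cut, h), (x + cut, y, w - cut, h)
    else:
        cut = rng.randint(min_size, h - min_size)
        a, b = (x, y, w, cut), (x, y + cut, w, h - cut)
    return refine(a, depth - 1, rng, min_size) + refine(b, depth - 1, rng, min_size)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the layout is repeatable
    for part in refine((0, 0, 64, 48), depth=4, rng=rng):
        print(part)
```

The point is that the top-level shape is decided first and every later step only subdivides what already exists, so the large-scale structure can't drift into chaos the way incremental tunnelling can.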
I think Valve used to have a series on level design that covered working from big to small and "anchor points", but I seem to have misplaced the link.
I've dreamed of a NeRF-powered backrooms walking simulator for quite a while now. This approach is "worse" because the mesh seems explicit rather than just the world becoming what you look at, but that's arguably better for real-world use cases of course.
True, it sounds (and looks) a lot like https://scp-wiki.wikidot.com/scp-3008