Why Fei-Fei Li and Yann LeCun Are Both Betting on "World Models"

https://entropytown.com/articles/2025-11-13-world-model-lecun-feifei-li/
56•signa11•1h ago

Comments

techblueberry•1h ago
I don’t know enough about this to be sure, but this feels like a white whale.
andrewflnr•1h ago
Human-level language was a white whale just a few years ago.
krainboltgreene•44m ago
A.L.I.C.E. was published in '95.
CrackerNews•1h ago
I think video, agentic, and multimodal models have led to this point, but actually making a world model may prove to be long and difficult.

I feel LeCun is correct that LLMs, as they stand, have limitations that call for an architectural overhaul. LLMs currently suffer from context rot, which would hamper an effective world model if the world disintegrates and becomes incoherent and hallucinated over time.

It's doubtful whether investors will be in for the long haul, which may explain Sam Altman's behavior in seeking government support. The other approaches described in this article may be more investor-friendly, as there is a more immediate return in creating a 3D asset or a virtual simulation.

Fricken•50m ago
A trillion dollars is now riding on that white whale. An entire naval fleet is being raised for the purpose of chasing down that whale. LeCun and Fei-Fei merely believe that the whale is in a different ocean.
andrewflnr•1h ago
If I were smarter, I would have predicted not only that everyone else would figure out that world models are a critical step, but that, as a direct consequence, the term "world model" would lose all meaning. Maybe next time. That said, LeCun's concept in the blog post is the only one worthy of the title.
Marshferm•28m ago
Control theory and cog-sci are impaired ideas. There is no mind, and cog sci is a post hoc retrofit narrated onto brains, rather than experience as events integrated. Cog sci is words sportscasting synthetic categories.

LeCun's model will fail because the idea of a world model is oxymoronic: brains don't need them and the world isn't modeled. All models are wrong; the world is experienced instantaneously in optic flow that's built atop olfaction.

https://www.eneuro.org/content/7/4/ENEURO.0069-20.2020

Any real AI that veers at control will have to adopt a neurobio path

https://tbrnewsmedia.com/sbus-sima-mofakham-chuck-mikell-des...

That's built paradoxically from unpredictability

https://pubmed.ncbi.nlm.nih.gov/38579270/

MangoToupe•13m ago
All abstractions of reality are bound to fail, but some abstractions are more convincing (or indeed more useful) than others.

> Any real AI that veers at control will have to adopt a neurobio path

Maybe. Or maybe it's a useless distraction. Only time will tell what signals are meaningful.

Marshferm•7m ago
Neuro is the experience integrating the allo/egocentric. We've already crossed that threshold where vision depth meets allocortex behaviors in entertainment, i.e. there's more intelligence in The Shining than in anything in current folk-science AI/cog sci. It's a resounding flop, and so will be the Gaussian and the psychobabble of LeCun's, as it is a psychological approach.
benatkin•1h ago
Whether or not this is exactly the same thing, I find this glossary entry from NVIDIA interesting: https://www.nvidia.com/en-us/glossary/world-models/
ChrisArchitect•55m ago
Earlier: https://news.ycombinator.com/item?id=45914363
IntrepidPig•52m ago
I always felt like one of the reasons LLMs are so good is that they piggyback on the many years that have gone into developing language as an information representation/compression format. I don’t know if there’s anything similar a world model can take advantage of.

That being said, there have been models that are pretty effective at other things without using language, so maybe it's a non-issue.

ares623•30m ago
I will gladly take $10B to find out for you.
allenleee•50m ago
With all due respect, AI is ultimately a capital game. World models aren’t where real B2B customer revenue comes from—at least compared to today’s LLMs; they’re mainly a better story for raising huge amounts of private capital. Hopefully they figure out how to build the next-gen AI architecture along the way.
echelon•42m ago
The most useful models are image, video, and audio models. It makes sense that we'd make the video models more 4D aware.

Text really hogged all the attention. Media is where AI is really going to shine.

Some of the most profitable models right now are in music, image, and video generation. A lot of people are having a blast doing things they could legitimately never do before, and real working professionals are able to use the tools to get 1000x more done - perhaps providing a path to independence from bigger studios, and certainly more autonomy for those not born into nepotism.

As long as companies don't over-raise like OpenAI, there should be a smooth gradient from next gen media tools to revolutionary future stuff like immersive VR worlds that you can bend like the Matrix or Holodeck.

And I'll just be exceedingly chuffed if we get open source and highly capable world models from the Chinese that keep us within spitting distance of the unicorns.

Aperocky•13m ago
That just sounds like text with extra steps.

Fundamentally, what AGI is trying to do is encode the ability to reason and apply logic. Tokens, images, video, and audio are all just information of varying entropy density, the output of that reasoning process (or an emulation of it).
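
A toy way to see the "different entropy density" point, using zlib's compression ratio as a crude proxy for information per byte. The sample data below is made up purely for illustration, not taken from the thread:

    import os
    import zlib

    # Repetitive English text: low entropy per byte, compresses well.
    text = ("the quick brown fox jumps over the lazy dog " * 200).encode()
    # Random bytes: a stand-in for high-entropy data such as already-compressed media.
    dense = os.urandom(len(text))

    for name, blob in [("english text", text), ("random bytes", dense)]:
        ratio = len(zlib.compress(blob)) / len(blob)
        print(f"{name}: compresses to {ratio:.2%} of its original size")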

ryandv•4m ago
> Fundamentally, what AGI is trying to do is encode the ability to reason and apply logic.

No? The Wason selection task has shown that logic and reason are neither core nor essential to human cognition.

It's really verging on speculation, but see chapter 2 of Jaynes 1976 - in particular the section on spatialization and the features of consciousness.

danielmarkbruce•12m ago
>> The most useful models are image, video, and audio models

This is wrong. The vast majority of revenue is being generated by text models because they are so useful.

MangoToupe•16m ago
> World models aren’t where real B2B customer revenue comes from

You could say the same thing about AGI. Ultimately capital will realize intelligence is a drawback.

philipkiely•49m ago
I played with Marble yesterday, Fei-Fei/World Labs' new product.

It is the most impressed I've been with an AI experience since the first time I saw a model one-shot material code.

Sure, it's an early product. The visual output reminds me a lot of early SDXL. But just look at what's happened to video in the last year and to images in the last three. The same thing is going to happen here, and fast, and I see the vision for generative worlds for everything from gaming/media to education to RL/simulation.

CrackerNews•36m ago
Marble appears to be like HunyuanWorld to me, but this time they marketed it as a first step to a world model, and it has multimodal capabilities.
IAmGraydon•42m ago
The LLM grift is burned up, so this is the next thing. It has just enough new magic tricks to wow the VCs who don't really get what's going on here. I think this comment from the article says it all:

“Taking images and turning them into 3D environments using gaussian splats, depth and inpainting. Cool, but that’s a 3D GS pipeline, not a robot brain.”
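
For context, a minimal sketch of the geometric core of the kind of pipeline that quote describes: back-projecting a depth map into 3D points that splats could then be fit to. The intrinsics and depth values here are made-up stand-ins, not World Labs' actual method:

    import numpy as np

    H, W = 480, 640                      # image size (assumed)
    fx = fy = 500.0                      # pinhole focal lengths in pixels (assumed)
    cx, cy = W / 2.0, H / 2.0            # principal point

    depth = np.random.rand(H, W) * 5.0   # stand-in for a monocular depth estimate

    # Back-project each pixel (u, v) with its depth into camera-space 3D coordinates.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    print(points.shape)  # (H*W, 3): a point cloud to seed splats; geometry, not an agent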

ares623•29m ago
It’s for the VCs who missed out early. Now’s their chance!
skywhopper•1m ago
Because they are smart enough to realize that current LLM tech is nearing a dead end and, even ignoring context and hallucination issues, cannot serve as a full AGI without actual knowledge of the real world.