https://uk.pcmag.com/ai/165970/meta-exploring-option-to-sell...
Meta bought too many GPUs, has spare capacity, and is exploring renting that capacity out.
The problem is not that the models need too much to do the job. If that were the case, Meta would not have spare capacity.
The problem is that the models currently can't be made to do the job.
I agree that people are investing as though the world is going to run itself while the ultra-wealthy run off in yachts to compare sizes. If it wasn't AI, it would just be tulips or something. That's just how people are. But maybe they'll be right, who knows.
Maybe Wang has correctly identified that the programming and agentic ability that Anthropic and OpenAI models have has largely come from armies of software engineers creating massive datasets by writing out coding and agentic problems and solutions?
So he told Zuckerberg that. The reason it may be turning into so much friction is that at companies like Anthropic or OpenAI, training engineers were either hired specifically for that purpose or probably mostly handled through contracts with third parties (which again, hired them to train AI). And honestly many of them may be overseas or just happy to have a job in a difficult period. But anyway they wouldn't have very high salary expectations etc.
But Zuckerberg already had 25000 engineers. Why not take, say, 1/5 of them and get them working on the dataset? The problem is that those engineers were hired for different, prestigious, highly paid positions at Meta/Facebook. They were not hired to do tedious grading of AI answers or quiz construction.
But Zuckerberg either has to do this, or spend additional billions on doing it all with external contractors. A third option would be to try to create a massive distillation operation. Or just hope that his engineers could invent some magical new training trick that manifested the agentic and programming skills without the large scale human input.
Or he could release a model trained largely by existing open weights models. Which without some huge breakthrough probably has no chance of surpassing them, so is pointless.
I think most of the substantive criticism of Zuckerberg has been about burning funds. If he gives up the "your job is to grade AI homework now" plan because his engineers refuse, he would need to go through third parties. The additional billions and billions this would cost would create more pressure on the bottom line and shareholder pressure.
It would also give up any potential advantage that Wang may have optimistically sold the operation on: that using "real" engineers, as opposed to lower-paid data-labelling engineers, might result in a higher-quality dataset.
At some point, model architectures that don't need such massive datasets, or datasets that can be created automatically in a way that advances the frontier, will probably come about. But right now they don't exist.
Further, the way AI works currently, business advantage from AI comes from encoding existing internal intelligence and knowledge. Meta's massive engineering corps effectively has that in their heads. Having them create these datasets is possibly the only way to leverage this knowledge asset in this paradigm.
I guess the problem is it means forcing thousands of people to do a different job from the one they were hired for.
The whole hype cycle has been pure delusion. Just like the Metaverse hype cycle before it.
Meta doesn't seem to be able to produce anything close to a frontier model. The selling of compute capacity seems to be acceptance of "compute is wasted on this crappy avocado model, we'd be better off allowing something better to run".
The problem is clearly in the model architecture, the training, and the data fed into the model, and that is what is causing them to give up on using their compute exclusively for their own models. They can't get it right, so they may as well sell the compute to someone who can.
Feels less like the pace of foundation model development and more so a specific failure of one organization to do something important.
penpendian•1h ago
yepyoukno•1h ago
The modern trend is to think intelligence is generative, “like compression” or “predicting next in sequence”, rather than iteratively reducing uncertainty, like those fault-tolerant humans.
AnotherGoodName•45m ago
No one ever in comp sci says artificial intelligence is "like compression"; they correctly state that "artificial intelligence IS compression". It's absolutely known and accepted that artificial intelligence (defined as predicting outcomes with a measure of certainty and taking chosen actions towards goals using those predictions) has equivalence to compression in a very hard science way. The hardest part of artificial intelligence is compression, and the remaining part, the choice of actions based on predictions, is just a tree search to a goal.
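
To make the prediction/compression link concrete, here is a minimal sketch, not anything from the thread itself. It assumes the standard information-theoretic fact that a symbol predicted with probability p can be coded in about -log2(p) bits (what an ideal arithmetic coder approaches). The two toy models and the helper names (`code_length_bits`, `uniform_model`, `adaptive_order0_model`) are made up for illustration; the point is only that the better predictor yields the shorter code.

```python
import math
from collections import Counter


def code_length_bits(text, predict):
    """Ideal total code length: each symbol costs -log2 p(symbol | context)."""
    total = 0.0
    for i, ch in enumerate(text):
        p = predict(text[:i], ch)  # model's probability for the next symbol
        total += -math.log2(p)
    return total


def uniform_model(alphabet):
    """Hypothetical baseline: no learning, every symbol equally likely."""
    def predict(context, ch):
        return 1.0 / len(alphabet)
    return predict


def adaptive_order0_model(alphabet):
    """Hypothetical learner: Laplace-smoothed counts of symbols seen so far."""
    def predict(context, ch):
        counts = Counter(context)
        return (counts[ch] + 1) / (len(context) + len(alphabet))
    return predict


text = "aaaaaaaaaaaaaaab"
alphabet = sorted(set(text))

uniform_bits = code_length_bits(text, uniform_model(alphabet))
adaptive_bits = code_length_bits(text, adaptive_order0_model(alphabet))

print(f"uniform model:  {uniform_bits:.2f} bits")
print(f"adaptive model: {adaptive_bits:.2f} bits")
```

The adaptive model learns that "a" is likely and pays less than one bit per "a", so its total is smaller than the uniform model's one bit per symbol. This only illustrates the prediction side of the claim; the "tree search to a goal" part is not modelled here.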