Meta Superintelligence's surprising first paper

https://paddedinputs.substack.com/p/meta-superintelligences-surprising

93•skadamat•2h ago

https://arxiv.org/abs/2509.01092

Comments

bigyabai•2h ago

> Long awaited first paper from Meta Superintelligence Labs is not a model layer innovation. What does this mean?

It means you're reading into it too much and need to be let down, gently, from the hype train.

nine_k•2h ago

A great post, it starts with this:

TL;DR

• MSI’s first paper, REFRAG, is about a new way to do RAG.

• This slightly modified LLM converts most retrieved document chunks into compact, LLM-aligned chunk embeddings that the LLM can consume directly.

• A lightweight policy (trained with RL) decides which chunk embeddings should be expanded back into full tokens under a budget; the LLM runs normally on this mixed input.

• The net effect is far less KV cache and attention cost, much faster first-byte latency and higher throughput, while preserving perplexity and task accuracy in benchmarks.

I wish more long posts followed this model of a scientific paper.

jongjong•2h ago

Interesting. All developers I know who tinkered around with embeddings and vector similarity scoring were instantly hooked. The efficiency of computing the embeddings once and then reusing as many times as needed, comparing the vectors with a cheap <30-line function is extremely appealing. Not to mention the indexing capabilities to make it work at scale.

IMO vector embedding is the most important innovation in computing of the last decade. There's something magical about it. These people deserve some kind of prize. The idea that you can reduce almost any intricate concept including whole paragraphs to a fixed-size vector which encapsulates its meaning and proximity to other concepts across a large number of dimensions is pure genius.

_jayhack_•1h ago

Vector embedding is not an invention of the last decade. Featurization in ML goes back to the 60s - even deep learning-based featurization is decades old at a minimum. Like everything else in ML this became much more useful with data and compute scale

senderista•1h ago

Yup, when I was at MSFT 20 years ago they were already productizing vector embedding of documents and queries (LSI).

ekidd•1h ago

Vector embeddings are slightly interesting because they come pre-trained with large amounts of data.

But similar ways to reduce huge numbers of dimensions to a much smaller set of "interesting" dimensions have been known for a long time.

Examples include principal component analysis/single value decomposition, which was the first big breakthrough in face recognition (in the early 90s), and also used in latent semantic indexing, the Netflix prize, and a large pile of other things. And the underlying technique was invented in 1901.

Dimensionality reduction is cool, and vector embedding is definitely an interesting way to do it (at significant computational cost).

liampulles•1h ago

If you take the embedding for king, subtract the embedding for male, add the embedding for female, and lookup the closest embedding you get queen.

The fact that dot product addition can encode the concept of royalty and gender (among all other sorts) is kind of magic to me.

mountainriver•1h ago

This was a very obvious next step, I played around with implementing something similar at one point.

In general we need to make it simpler for LLMs to take in different forms of embeddings. At least frameworks that simplify it.

cm2012•1h ago

At first I thought the super intelligence wrote a novel scientific paper

Imnimo•1h ago

I'm curious whether this is work that was specifically begun under the "superintelligence" umbrella, or if it's just that the people who were working on it had been shifted to the Superintelligence team by the time they wrote the paper. I would guess the former?

naasking•59m ago

> the core insight here is actually: if embeddings are generated by layers within the LLM, it makes no sense to convert them back to natural language, just for another LLM to compress those tokens back to embeddings.

Doesn't this tie the two layers together in a way that they can't evolve separately?

xvector•57m ago

Working in big tech it's pretty wild to see how integral AI has become to our work internally, vs the public perception of it. People are NOT prepared.

fishmicrowaver•53m ago

Not prepared for what? Seems like the rest of the world is desperate to be shown the way to unlock something of value?

Workaccount2•38m ago

I think at this point it's software devs looking for the value unlock.

Non-software devs are actually making functional programs for themselves for the first time ever. The value is crazy.

ceejayoz•30m ago

It’s not the first time ever. People did the same with Access and HyperCard in the 90s.

fishmicrowaver•17m ago

Sure but I'm the real world do you think businesses are going to deploy piles of code into production generated this way? No, non technical people will continue to whip up MS PowerApps. AI generated code has no value to many businesses.

terminalshort•52m ago

1. Hyperbolic statement about LLM capabilities with no concrete examples

2. Wild claim that the companies that sell LLMs are actually downplaying their capabilities instead of hyping them

danielmarkbruce•17m ago

Yup, he's totally lying. Not happening. Just carry on.

BoorishBears•12m ago

Agreed, but why are they lying?

godelski•53m ago

It's kinda funny, Meta has long had some of the best in the field, but left them untapped. I really think if they just took a step back and stop being so metric focused and let their people freely explore then they'd be winning the AI race. But with this new team, I feel like meta mostly hired the people who are really good at gaming the system. The people that care more about the money than the research.

A bit of this is true at every major lab. There's tons of untapped potential. But these organizations are very risk adverse. I mean why not continue with the strategy that got us to the point we're at in the first place. Labs used to hire researchers and give them a lot of free reign. But those times ended and AI progress also slowed down. Maybe if you want to get ahead you gotta stop thinking like everyone else

Well meta... you can "hold me hostage" for a lot cheaper than those guys. I'm sure this is true for hundreds of passionate ML researchers. I'd take a huge pay cut to have autonomy and resources. I know for a fact there's many working at Mets right now that would do the same. Do maybe if you're going to throw money at the problem, diversify a bit and look back at what made SV what it is today and what made AI take leaps forward

ipsum2•48m ago

This has nothing to do with superintelligence, it's just the people that were working on the paper prior to the re-org happened to publish after the name change.

Though it is notable that contrary to many (on HN and Twitter) that Meta would stop publishing papers and be like other AI labs (e.g. OpenAI). They're continued their rapid pace of releasing papers AND open source models.

bigcat12345678•28m ago

https://docs.lamini.ai/memory_rag/ Similar approaches have been tried before already

pppoe•23m ago

I find it absurd that, compared to the past, large companies now have more abundant stock prices and cash than ever before, yet nearly every AI Lab in these companies is facing greater pressure than ever, being asked to generate short-term profits. In the midst of AI's unprecedented boom, the research environment and atmosphere in the industry seem to have worsened compared to the past.

signatoremo•14m ago

Is this Meta’s lab pressured to generate short term profits?

Which other under pressure labs are you talking about?

foldl2022•22m ago

So, show me the model weights, please.

yalogin•14m ago

I am not surprised because the culture at meta is not at all, even in the slightest, to focus on science for the sake of it. It’s actively actively purged out of you. The focus is on metrics and how the bottom line is impacted. So this is in line with that

Labels in HTML

Israeli Spyware Vendor NSO Group Reportedly Sells to U.S. Company

Africa's oldest leader isn't ready to retire; he's not the only one defying age

Ed Davey urges regulator to go after Musk over X 'crimes'

Why Can't Fashion See What It Does to Women?

I've Gone to Look for America

A GenZ billionaire:Shayne Coplan figured out a society that gambles on everythin

Before Haskell, there was the Orwell programming language [pdf]

BillionToOne IPO

China's New Rare Earth and Magnet Restrictions Threaten US Defense Supply Chains

Trump Administration Lays Off CDC Officials

Gitid: Nvm for Git identities, now supports per-repo identity

FOXP3 and scurfy: how it all began

Bilingual Hebrew Liturgical Songs Sung Beautifully

Power-over-Fiber

People on the far-right and far-left exhibit strikingly similar brain responses

Terminal Lucidity: Envisioning the Future of the Terminal

How Google Is Walking the AI Tightrope

Susam Pal: My Lobsters Interview

Study Traces Autism's Origin to the Rise of Human Intelligence

The 'profound' global impact of China's rise as an electrostate

What do coyotes think?

ROSA+: RWKV's ROSA implementation with fallback statistical predictor

Marc Benioff Says Trump Should Deploy National Guard to San Francisco

Putin Has a New Tool to Monitor Russians

1990s Millport CNC Vertical Mill Revival

Chinese EV giant BYD sees UK sales soar by 880%

The Rise of 'Conspiracy Physics'

The uranium plant at the center of U.S. plans to expand nuclear power

OpenRouter drops fees in response to Vercel's AI Gateway