Muse Spark – Meta Superintelligence Labs

https://meta.ai/

162•snowman647•1h ago

https://ai.meta.com/blog/introducing-muse-spark-msl/

Comments

babelfish•1h ago

Probably a better link: https://ai.meta.com/blog/introducing-muse-spark-msl/

dang•1h ago

I've put that link in the top text - thanks!

warthog•1h ago

Hoping the benchmarks are correct this time...

htrp•1h ago

Anyone done vibe testing at meta ai yet?

zurfer•1h ago

> Muse Spark is available today at meta.ai and the Meta AI app. We’re opening a private API preview to select users.

m4r1k•1h ago

So no open weights... why would one choose Muse Spark instead of Anthropic, OpenAI, or Google models, all of which feature a good-to-amazing harness?

Artgor•1h ago

I'm cautiously waiting for the feedback from the first users. Meta has produced a lot of great models (Llama), maybe this is a comeback... but I'm cautious, as the jump in quality is almost too high.

Also, I think people aren't used to the idea that using such models requires meta.ai or the Meta AI app.

solenoid0937•1h ago

My Meta friends say it's benchmaxxed af

loeg•50m ago

We used to call this "overfitting," but I suppose everything has to be maxxed now. Fitmaxxed?

conradkay•1h ago

It doesn't seem benchmaxxed, ARC AGI 2 score is quite bad (42.5%, GPT 5.4 is 76.1%) and coding is okay. But maybe this is the best Meta can do even benchmaxxing

The impressive part is multimodality, very plausible since there's less focus there by other labs (especially Anthropic)

khalic•1h ago

Oh good, if they built a lab, I’m sure they took the time to precisely define what they mean by super intelligence? Right? …

52-6F-62•1h ago

If this is super intelligence, then it follows we must all be super-duper intelligence.

gallerdude•1h ago

This would have been an amazing release 6 months ago. But the industry moves so fast, this is a trite release. Maybe it’s best for Meta to sell their superintelligence division. I don’t think Zuck’s vision is particularly compelling.

gordonhart•1h ago

A new model comparable (ish) to the Claude/Gemini/GPT flagships is a big deal for the industry and for Meta even if it doesn't set the new frontier.

zozbot234•1h ago

Their new Contemplating mode gives this model a Deep Research ability (akin to existing models from GPT and Gemini) that might make it quite comparable to the just-announced Mythos.

solenoid0937•1h ago

Mythos is a much bigger pre train, Contemplating is not the same thing.

zozbot234•1h ago

> Mythos is a much bigger pre train

Do we have data to substantiate that claim?

solenoid0937•57m ago

It's pretty common knowledge. Spud is the only other PT comparable with Mythos.

Both Spud and Mythos can also scale via inference time compute.

Meta simply did not have enough compute online, long enough ago, to have a similar PT.

gallerdude•1h ago

I’m not sure. If it was open source, certainly. But 4th place doesn’t really matter if you have nothing different to add.

datadrivenangel•1h ago

Fourth place means you're not reliant on any of the external providers for internal AI use, which is important for organizational health and negotiating with those other providers.

lairv•55m ago

If the model is truly on par with Opus 4.6/Gemini 3.1/GPT 5.4 (beyond benchmarks) this still puts MSL in the frontier lab category, which is no small feat given that they pretty much rebooted last year

Many labs aren't able to keep up with the frontier, xAI, Mistral

blahblaher•1h ago

Why would you use this instead of the other more proven models? Unless it's significantly cheaper. The general population mostly wants it free, and the more professional users are willing to pay for good/better responses.

NitpickLawyer•45m ago

You wouldn't use this as an API. You would "use" this inside the meta properties. Have a shop on fb marketplace? Now you have copy, images, support, chat, translations, erp, esp, fps and all the other acronyms :) and so on for your mom and pop shop @200$/mo. Probably worse than say claude/gemini but it's right there, one button away. "Click here to upgrade to AI++" or something.

gallerdude•11m ago

But rolling your own can’t be that much cheaper than buying it from a leading lab. Especially when you consider the amount of spending on datacenters.

hnav•4m ago

leading labs are going to be tightening the screws. Otherwise why not just run the entire company on a public cloud?

gordonhart•25m ago

I won't use it, but I'm excited to see it for the same reason why I'm excited to see a near-frontier open-source release: more competition pushes prices down and reduces monopoly/cartel risk. I won't use Muse or Grok or GLM at this point but they're good for the ecosystem.

dgellow•1h ago

I never understood why meta decided to join the race. They don’t sell compute like Google or Microsoft. Why not let others do the hard work and integrate their LLMs in your systems if needed? I assume it’s because they have Instagram, Facebook, WhatsApp, Threads data and feel they should be the ones using it for training, but it’s really not obvious how having a frontier AI lab benefits their business

gallerdude•1h ago

I’m sure there’s more to it than this, but it feels like Zuck has pet interests like VR and now AI.

alex1138•1h ago

But no account support, that's boring

Or any quality control (people missing posts)

Or banning the people who should be banned while leaving everyone else alone

This is Zuck: https://news.ycombinator.com/item?id=4151433 or https://news.ycombinator.com/item?id=10791198

yoz-y•1h ago

AI NPCs to fill in the empty Metaverse?

awestroke•1h ago

Because Zuck has chronic FOMO, he's said as much himself

zeroonetwothree•1h ago

But then how will Zuck win the billionaire dick measuring contest?

chairmansteve•1h ago

Pumps up the stock price.

xnx•1h ago

Zuck is trying to convince himself he's good, and not just lucky.

swyx•58m ago

you dont understand why zuck, who paid $1B for instagram when they had no revenue and 7 employees because he is paranoid about platform shifts, decided to join the race for (what is seeming highly possibly) the biggest platform shift in human history?

oceansky•49m ago

He also tried and failed to buy Snapchat, and then copied their feature on all their big products: Instagram, Facebook and even WhatsApp.

prodigycorp•46m ago

The way you put it, I understand it less. lol

observationist•57m ago

Adtech Money. They've got GPUs, they've got the infrastructure, and they've got the advertisement platform, and the point is getting AI that can exploit the adtech and create a flywheel effect, maximizing return from the data they collect from Insta, WhatsApp, Facebook, etc.

It's not just about LLMs, it's about being able to model consumers and markets and psychology and so on. Meta is also big in the manipulation side of things, any sort of cynical technological exploitation of humans you can imagine but that is technically legal, they're doing it for profit.

bee_rider•50m ago

I think they just want to be a winner in the “next thing.” They hit social networking, but missed mobile operating systems and didn’t compellingly win at social media. Eventually an ambitious person with a bazillion dollars wants a clear win, right?

eldenring•46m ago

Because there's a realistic chance this is the only important software technology moving forward, and it commoditizes Meta's entire business, which is software.

chermi•41m ago

You basically have to be involved if you're meta. Even if there's only 5% chance this AI stuff is as disruptive as the labs claim it is, you can't afford to miss out. Even if you're lagging frontier, you must develop the competency internally. Otherwise you ignored a 5% chance of total annihilation, probably even exposing you to shareholder lawsuits.

KaiserPro•35m ago

A few things:

1) meta was doing this at scale before openAI

2) decent ML is critical to categorising content at scale, the more accurate and fast the category, the finer the recommendations can be (ie instead of woman, outside as a tag for a video, woman, age, hair colour, location, subjects in view, main subject of video, video style). Doing that as fast as possible with as little energy as possible is mission critical

3) The llama leak basically evaporated the moat around openAI who _could_ have become a competitor

4) for the AR stuff, all of these models (and visual models) are required to make the platform work. They also need complete ownership so that it can be distilled to make it run on tiny hardware

5) dick swinging

6) they genuinely want to become an industrial behemoth, so robots, hardware, etc are now all in scope.

vinni2•28m ago

From what I heard Meta is spending hundreds of millions each month in Claude credits for developers. So that’s a huge saving if they have their own models that match Opus.

addandsubtract•8m ago

To download all those torrents, obviously.

throwaw12•1h ago

> I don’t think Zuck’s vision is particularly compelling.

But he has to do it anyways, otherwise Meta can be disrupted easily.

Google and Apple have hardware and distribution channels for their products

Amazon has the marketplace and cloud

Microsoft has enterprise and cloud

Meta is always looking for ways to stay afloat

xnx•1h ago

Meta has 3.5 billion daily active users

throwaw12•54m ago

and has competitors like: TikTok, SnapChat, YouTube, Netflix, X, HBO, Amazon Prime, all fighting for the attention time.

They are worried something like Sora can disrupt them quickly

chrsw•1h ago

So Meta is not releasing open source models anymore?

sidcool•1h ago

Will experiment with the model. But I am scared of sharing any information with the Zuck ecosystem.

ensen•1h ago

earlier: https://news.ycombinator.com/item?id=47692043

toddmorey•1h ago

Question: since they've rebooted their approach to AI... have they given up on open models? There's no mention of open source or open weights or access to the models beyond their hosted services.

thegeomaster•1h ago

Alexandr Wang on Twitter [0] mentioned open source plans:

"this is step one. bigger models are already in development with infrastructure scaling to match. private api preview open to select partners today, with plans to open-source future versions. incredibly proud of the MSL team. excited for what’s to come!"

[0] https://x.com/alexandr_wang/status/2041909388852748717

prodigycorp•49m ago

So the answer is: no. lol. Remember Llama 4 Behemoth, and how we were supposed to get more great models from it?

wmf•1h ago

This may be too large to run locally anyway. Maybe they will distill down some smaller open versions later.

OsrsNeedsf2P•1h ago

The only benchmark they show against SOTA models is in bioweapons refusal.

Edit: nvm I can't read, regular benchmarks against SOTA are there

santiagobasulto•1h ago

This looks like a very interesting model and very promising, especially after llama lost so much ground recently. I hope they release the weights

visioninmyblood•1h ago

https://meta.ai/ is where you can try it. It seems like the API is not publicly accessible yet. I feel they are very late to the game and do not show value to customers over other models.

p_stuart82•59m ago

late isn't the problem. private preview api and no reason to switch. that's just another hosted model

throwaw12•1h ago

How is it that Meta spent so much money on talent and hardware, but the model barely matches Opus 4.6?

Especially, looking at these numbers after Claude Mythos, feels like either Anthropic has some secret sauce, or everyone else is dumber compared to the talent Anthropic has

zozbot234•1h ago

> has some secret sauce

Yup, it's called test-time compute. Mythos is described as plenty slower than Opus, enough to seriously annoy users trying to use it for quick-feedback-loop agentic work. It is most properly compared with GPT Pro, Gemini DeepThink or this latest model's "Contemplating" mode. Otherwise you're just not comparing like for like.

throwaw12•1h ago

> it's called test-time compute.

Why can't others easily replicate it?

coder68•57m ago

I have not delved into the theory yet but it seems that the smaller open-source models do this already to an extent. They have fewer parameters, but spend much more time/tokens reasoning, as a way to close the performance gap. If you look at "tokens per problem" on https://swe-rebench.com/ it seems to be the case at least.

strulovich•1h ago

Meta made a bunch of mistakes, and it looks like Zuckerberg spent a lot of money on talent and made big swings to change it (that happened about a year ago)

I think it’s unrealistic to expect them to come back from that pit to the top in one year, but I wouldn’t rule them out getting there with more time. That’s a possible future. They have the money and Zuckerberg’s drive at the helm. It can go a long way.

solenoid0937•1h ago

It's benchmaxxed.

If they actually matched Opus 4.6 on such a short timeline, it would have been mighty impressive. (Keep in mind this is a new lab and they are prohibited from doing distills.)

throwaw12•1h ago

how do you know it's benchmaxxed?

solenoid0937•56m ago

Friends at Meta with access to the model + personal experience at Meta.

Meta's performance process is essentially "show good numbers or you're out." So guess what people do when they don't have good numbers? They fudge them. Happens all across the company.

prodigycorp•51m ago

meta's benchmaxing tendencies are well known. llama4 was mega benchmaxxed, there's nothing that suggests to me that meta's culture has changed.

luma•30m ago

For one, they aren't using the latest version of many of the benchmarks. eg, ARC-AGI 2 and not 3, etc.

impulser_•1h ago

It's not even on par with Sonnet. It's on par with open source models, and it's not even open source and sits behind a private preview API.

Might as well not release anything.

username223•1h ago

Facebook is working with the talent that can’t find a job at some other company. It doesn’t surprise me they ship mediocrity.

coffeebeqn•57m ago

Matching Opus 4.6 would be pretty good? It’s the SOTA actually available model

reissbaker•35m ago

Muse Spark doesn't even match GLM-5.1 on most benchmarks. And GLM is open source!

rvz•1h ago

Until you actually try the model itself, assume any benchmark presented to you as being part of the marketing material of the model, as it is not independently verified and completely biased.

The same is true with any other model, unless otherwise stated.

In the next few days, we'll see who Meta has paid to promote this model on social media.

creddit•1h ago

Ran some of my internal benchmarks against this and I'm very unimpressed. I don't think this moves them into the OAI v Anthropic v Gemini conversation at all.

Major analytical errors in their response to multiple of my technical questions.

creddit•58m ago

Playing with this some more and it's actively not good. Just basic mathematical errors riddling responses. Did some basic adversarial testing where its responses are analyzed by Gemini, and Gemini is finding basic math errors across every relatively simple (relative to what Opus, Gemini or GPT can handle) ask I make. Yikes.

oliver236•1h ago

so glad its beating all the others on bioweapons refusal. this is what i most wanted out of the latest SOTA model

wmf•1h ago

Zuck has a lot more experience being summoned before Congress than you.

ChrisArchitect•1h ago

Some more discussion: https://news.ycombinator.com/item?id=47692043

ComputerGuru•1h ago

So does this confirm the end of llama?

ehutch79•1h ago

How's the metaverse doing? It was the next big thing and how we're all going to be working inside it in... was it like 3 months ago?

Maybe they need to mine more libra coin first? or is it diem now? is that even still part of meta?

I'm sure this new AI is super intelligent and super awesome and will be writing all the code, making all the blog posts, and generating all our youtube shorts in 6 months.

captn3m0•52m ago

Libra/Diem got sold to the bank they were partnering with (Silvergate) for $200M, which then filed for Bankruptcy.

https://en.wikipedia.org/wiki/Diem_(digital_currency)

serf•52m ago

what's with the negativity?

yeah, the metaverse got abandoned. Also: Meta was the only one to try the concept for the past X-umpteen years even though everyone in the industry ga-gas over virtual reality worlds and workplaces at every opportunity. It's literally Meta and Linden Labs (which has been on life support for 10+ years.)

The alternative is : no one does it and nothing gets abandoned, which the industry has shown itself to be exceedingly good at w.r.t VR for the past 40+ years.

To be clear: I have no faith in meta as a company; my problem lies in kicking an entity because they attempted something different.. I don't think that's productive, and it produces stuff like the past AI winters because groups get afraid of touching experimental concepts ever again lest they incur the wrath of the shareholder.

ehutch79•41m ago

It's not the failure here or there, it's a pattern. It's not even the failing, it's the excessive hype cycle.

We keep seeing things being overhyped, with not much thought behind it. Meta is particularly bad about it. They changed their name for the hype of their VR product, when VR was still niche and had a long way to go, and still does. They couldn't even figure out legs for launch.

Now they have a 'superintelligence'? Yeah, that sounds like just the latest in a line of bullshit. Why would this be different?

sva_•50m ago

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

https://news.ycombinator.com/newsguidelines.html

ehutch79•47m ago

Establishing a pattern of over hyping of projects that then disappear isn't a shallow dismissal.

jansport123•48m ago

did they just copy the chatgpt ui?

tty456•47m ago

I don't get the comments trashing this. If it slightly beats or even matches Opus 4.6, it means Meta is capable of building a model competitive with the leading AI company. Sure, they spent a lot of money and will have ongoing costs. But how much more work would it take to turn that into a coding agent people are willing to try (and pay for) alongside their usage of a collection of agents (Claude, Codex, etc)? It also means Meta doesn't have to pay another company to use a SOTA model across all their products (including IG, WhatsApp, and VR), which will matter to their balance sheet long term (despite the constant R&D spend).

prodigycorp•41m ago

Comments trashing this are from rightly skeptical people who remember the benchmaxxing of llama 4. This model was out in the woods as early as a couple of months ago, but they didn't release it because it was at gemini 2.5 pro levels.

zozbot234•36m ago

The llama4 series was one of the earliest large MoEs to be made publicly available. People just ignored it because they were focused on running smaller and denser models at the time; we should know better these days.

prodigycorp•28m ago

the models were objectively horrible

NitpickLawyer•20m ago

They really weren't horrible. They were ~gpt4o, with the added benefit that you could run them on premise. Just "regular" models, non "thinking". Inefficient architecture (number of active out of total) but otherwise "decent" models. They got trashed online by bots and chinese shills (I was online that weekend when it happened, it's something to behold). Just because they were non-thinking when thinking was clearly the future doesn't make them horrible. Not SotA by any means, but still.

prodigycorp•14m ago

Nah I remember how disgusted I felt trying llama 4 maverick and scout. They were both DOA.. couldn't even beat much smaller local models.

dilap•5m ago

Deepseek R1 was a publicly available MoE model that was getting a ton of attention before llama4. Llama4 didn't get much attention because it wasn't good.

redox99•35m ago

> If it slightly beats or even matches Opus 4.6

It doesn't though

ryeguy_24•30m ago

Curious on why you think this. Any data points that led you to this?

howdareme•20m ago

The benchmarks they released

gritspants•26m ago

I would like someone to tell me how stupid I am. If I were Meta/Zuck I'd open source a great model the moment my company developed it. This just looks like a pitch to investors, otherwise.

jamiequint•23m ago

"This just looks like a pitch to investors"

The goal of public companies is generally to generate profit for their investors.

samrus•19m ago

I'm beginning to think that's the mantra we'll keep reciting as this whole country slowly falls apart

ChipopLeMoral•13m ago

> I don't get the comments trashing this.

People like to hate on Meta regardless of anything, and regardless of whether it's justified or not. Not saying it isn't, just that it's many people's default bias.

1970-01-01•46m ago

I can remember when AOL was an unstoppable giant. Except it wasn't. People eventually realized they could get a better, cheaper, faster experience with ISPs and search engines. The same path is unfolding before Meta. People have much better options, and a plethora of Meta users will slowly leave until the big moat is drained. Zuck, go retire to your NZ bunker before Meta is forced to merge with another media company.

eranation•42m ago

So this is why Anthropic rushed the weirdest "pre-responsible-disclosure-totally-not-for-marketing" announcement yesterday? To make sure Spark doesn't steal their thunder? (Spark beats Opus 4.6 on some benchmarks...). Or did I become a bitter cynical old man.

hnav•39m ago

It's giving "OpenAI says its new model GPT-2 is too dangerous to release (2019)"

bguberfain•35m ago

We all know it... but I think they were very bold in this warning about using your private messages to train public models. _Your messages with AIs will be used to improve AI at Meta. Don't share information, including sensitive topics, about others or yourself that you don't want the AI to retain and use_

discopicante•30m ago

meta doesn't exactly instill confidence on using personal data responsibly. hard pass

Kuyawa•34m ago

> Meta AI isn't available yet in your country

Not my loss, will keep using DeepSeek then. Wake me up when my country is no longer in the wrong/right side of history.

vinni2•30m ago

I have to create meta account to access. No thanks.

ge96•16m ago

funny how websites do that thing where it looks like you can use the product but soon as you hit enter, nope login first

edwcross•14m ago

What is the "BioTIER-refuse" thing mentioned in the "Bioweapons Refusal" graph?

I Googled it and found absolutely nothing.

Well, to be honest, I got 100% of websites containing the French word "boîtier" (box) with a typo.

Even on Google Scholar, the closest match is "BioTiER (Biological Training in Education and Research) Scholars Program", which is at least 10 years old and has nothing to do with that.

Is that an AI-generated image with an AI-generated name that has no physical existence?

EnderWT•10m ago

https://securebio.org/biotier/

binaryturtle•10m ago

Looks like it needs a meta account? As soon as you hit enter it wants you to log in. I guess I won't try this any time soon. :)

tekacs•5m ago

https://meta.ai/share/pe4HxOfv2Bp

Finding it a little bit tricky to evaluate because the harness is unfortunately very, very bad (e.g. search is awful). Can't wait to try this in some real external services where we can see how it performs for real.

Definitely getting ordinary high-quality results, overall. But hard to test the agentic behavior and hard to test prose quality, even, when just working off of the default chat interface.
