Mozilla says 271 vulnerabilities found by Mythos and "almost no false positives"

https://arstechnica.com/information-technology/2026/05/mozilla-says-271-vulnerabilities-found-by-mythos-have-almost-no-false-positives/
72•epistasis•3h ago

Comments

lschueller•3h ago
Let's see how this will improve daily SOC work. I still don't see what the big difference between Mythos and Opus is, security-wise. I'm confident that this kind of vulnerability detection is a long-term improvement. But does Mythos specifically make such a big difference compared to "normal" models? I would love to see what the actual difference is.
JoshTriplett•3h ago
Among other things, Mythos seems better at "let me find, weaponize, and stack vulnerabilities until I get end-to-end from untrusted content to root", rather than just finding one thing in a specific identified area.
mccr8•1h ago
Quantifying the abilities of an LLM is a hard research problem, so I'm not sure if I can describe it in any great way, but Mythos did seem to be fairly clever about putting together things from different domains to find problems.

For instance, in one of the included bugs (2022034) it figured out that a floating point value being sent over IPC could be modified by an attacker in such a way that it would be interpreted by the JS engine as an arbitrary pointer, due to the way the JS engine uses a clever representation of values called NaN-boxing. This is not beyond the realm of a human researcher to find, but it did nicely combine different domains of security.

As the person responsible for accidentally introducing that security problem (and then fixing it after the Mythos report), while I am aware of NaN-boxing (despite not being a JS engine expert), I was focused more on the other more complex parts of this IPC deserialization code so I hadn't really thought about the potential problems in this context. It is just a floating point value, what could go wrong?
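
To make the NaN-boxing hazard above concrete, here is a minimal Python sketch. The tag layout (`OBJECT_TAG`, a 48-bit payload) and all function names are assumptions invented for illustration; this is not SpiderMonkey's actual encoding, nor the code from bug 2022034. It only shows why 64 raw bits from an untrusted sender should have NaNs canonicalized before being stored as a "double".

```python
import math
import struct

# Hypothetical toy layout: the top 16 bits select the type, the low 48 bits
# carry the payload. Real engines use different, more elaborate layouts.
TAG_MASK = 0xFFFF << 48
OBJECT_TAG = 0xFFFC << 48          # a quiet-NaN bit pattern reused as "object pointer"
PAYLOAD_MASK = (1 << 48) - 1
CANONICAL_NAN = 0x7FF8 << 48       # the one NaN pattern allowed for real doubles


def double_to_bits(x: float) -> int:
    """Return the raw IEEE 754 bit pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def decode(bits: int):
    """Interpret a boxed 64-bit value the way the toy engine would."""
    if bits & TAG_MASK == OBJECT_TAG:
        return ("object", bits & PAYLOAD_MASK)
    return ("double", bits_to_double(bits))


def box_unchecked(bits: int) -> int:
    """Unsafe: trusts that the sender really sent an ordinary double."""
    return bits


def box_checked(bits: int) -> int:
    """Safer: collapse every NaN to the canonical pattern before boxing."""
    if math.isnan(bits_to_double(bits)):
        return CANONICAL_NAN
    return bits
```

With the unchecked path, attacker-chosen bits such as `OBJECT_TAG | 0x12345678` decode as an object reference to an address the attacker picked. The checked path turns the same bits into a plain NaN, while ordinary doubles pass through unchanged.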

lschueller•57m ago
Okay, so far it makes sense to me. But is the issue with JS and floating point values, which isn't something super special or super rare, only detected and identified by Mythos, while Opus wouldn't get to this point?
IainIreland•21m ago
There doesn't have to be a huge qualitative discontinuity between Opus and Mythos. It's just that Mythos has reached a threshold where it's finally smart enough that putting it in a loop and asking it to find bugs is suddenly really effective. Especially at the beginning, Mozilla wasn't doing anything particularly clever with prompts. Mythos is just smart enough that the hit rate on obvious prompts is high enough to matter. (Maybe you can get similar performance out of Opus 4.6 with really smart prompts, but AFAICT nobody had managed it until Mythos.)
input_sh•3h ago
Original source: https://news.ycombinator.com/item?id=48051079

It's better because it actually lists a sample of Bugzilla reports that were made public. This topic was discussed previously (36 comments two weeks ago: https://news.ycombinator.com/item?id=47885042), but the part about bug reports being made public is brand new.

ChrisArchitect•2h ago
[dupe] Discussion on source: https://news.ycombinator.com/item?id=48051079
MetaverseClub•2h ago
I'm curious how Mozilla did bug finding before Mythos. Did they use any non-AI bug finding tools?
mccr8•2h ago
The usual sorts of fuzzing and static analyses, using AddressSanitizer and ThreadSanitizer. Also, with a bug bounty program to try to encourage external researchers to report issues. (I work on Firefox security; also I fixed 2 of the bugs linked in the blog post.)
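
For readers unfamiliar with fuzzing, a minimal mutation-fuzzing loop looks roughly like the sketch below. Everything in it is hypothetical: `parse` is a toy target with a deliberately planted bounds bug, and `fuzz` is not any Mozilla tool. In native code the crash would typically be reported by a sanitizer such as AddressSanitizer; here a Python `IndexError` plays that role.

```python
import random


def parse(data: bytes) -> bytes:
    """Toy length-prefixed parser with a planted bug (hypothetical target)."""
    if not data:
        return b""
    n = data[0]              # first byte claims the payload length
    _last = data[n]          # BUG: never checks that n < len(data)
    return data[1:n + 1]


def fuzz(target, seed_input: bytes, iterations: int = 1000, rng_seed: int = 0):
    """Mutate one random byte per iteration; return the first crashing input."""
    rng = random.Random(rng_seed)
    for _ in range(iterations):
        data = bytearray(seed_input)
        data[rng.randrange(len(data))] = rng.randrange(256)
        try:
            target(bytes(data))
        except IndexError:
            return bytes(data)   # a reproducer for the out-of-bounds read
    return None


crash = fuzz(parse, b"\x03abc")
```

The loop only finds the bug when a mutation happens to hit the length byte. Production fuzzers add coverage feedback and structure-aware mutation to get past that limit.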
canucker2016•1h ago
Coverity (similar to lint) scans various open source software products for vulnerabilities.

see https://www.blackduck.com/static-analysis-tools-sast/coverit...

and for Firefox-related alleged defects, see https://scan.coverity.com/projects/firefox

You have to create an account to view the actual reported defects.

There are just over 5000 reported defects still outstanding. I don't know how many overlap with the reported 271 Mythos-reported defects.

rockdoe•54m ago
How many of those are false positives though? Probably just over 5000?

You get bug bounties if you report the kind of bugs Mythos identified. There's a reason no-one collected bounties from the "5000 defects" Coverity identified.

The Mythos reports have several examples of chaining a whole bunch of logic in different parts of the program together to exploit something very subtle. The Coverity reports aren't anything like that. These tools aren't remotely in the same league or even universe.

IainIreland•19m ago
Yeah, fuzzing, sanitizers, and bug bounties were our main pre-AI tools for finding bugs.
jerrythegerbil•2h ago
Again, and this is important:

A bug is a bug. A “potential vulnerability” is a bug. A vulnerability is verifiable as having security implications with a proof of concept or other substantial evidence.

Words matter. Bugs matter. It’s important to fix large amounts of bugs, just as it always has been, and has been done. Let that be impressive on its own, because it IS impressive.

Mythos didn’t write 271 PoCs for vulnerabilities and demonstrate code path reachability with security implications. Mythos found 271 valid bugs. Let that be enough.

epistasis•1h ago
I was a bit confused by your definitions, but here's how Mozilla broke out [1] the 271, um, things:

> As additional context, we apply security severity ratings from critical to low to indicate the urgency of a bug:

> * sec-critical and sec-high are assigned to vulnerabilities that can be triggered with normal user behavior, like browsing to a web page. We make no technical difference between these, but sec-critical bugs are reserved for issues that are publicly disclosed or known to be exploited in the wild.

> * sec-moderate is assigned to vulnerabilities that would otherwise be rated sec-high but require unusual and complex steps from the victim.

> * sec-low is assigned to bugs that are annoying but far from causing user harm (e.g, a safe crash).

> Of the 271 bugs we announced for Firefox 150: 180 were sec-high, 80 were sec-moderate, and 11 were sec-low.

Mozilla uses the term "vulnerability" for even sec-high, even though they say right below that it doesn't mean the same thing as a practical exploit. And on their definitional page, they classify even sec-low as "vulnerabilities" [2].

Words are tools that get their utility from collective meaning. I'd be interested to know where you got your semantics from and whether they match up or disagree with Mozilla's.

[1] https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...

[2] https://wiki.mozilla.org/Security_Severity_Ratings/Client

Gregaros•52m ago
> Mozilla uses the term "vulnerability" for even sec-high, even though they say right below that it doesn't mean the same thing as a practical exploit.

That’s not evident in what you pasted at all.

What you pasted says

> sec-critical and sec-high are assigned to vulnerabilities that can be triggered with normal user behavior […] We make no technical difference between these […] sec-critical bugs are reserved for issues that are publicly disclosed or known to be exploited in the wild.

> sec-low is assigned to bugs that are annoying but far from causing user harm (e.g, a safe crash).

From this one infers that the "180 were sec-high" bugs found are actually exploitable but not known to have been exploited in the wild, and are NOT mere annoying bugs.

The difference between 180 and 270 does nothing to deflate the significance, or lack thereof, of the implication re: Mythos.

epistasis•37m ago
Yes, it is not in what I pasted, as I said, "even though they say right below". If you don't believe me then click on either of the links.
throw0101c•50m ago
Presumably there are (implicit?) "sec-none" things, like [a] from the recently released 150.0.2 [b], which makes absolutely zero mention of "Security Impact" or "Severity" in the bug report, unlike [c], which is listed in the Mozilla weblog post [2].

Security things are mentioned in the Release Notes [b] pointing to a completely different document [d].

Perhaps sometimes a bug is 'just' a bug, and not a vulnerability.

[a] https://bugzilla.mozilla.org/show_bug.cgi?id=2034980 ; "Can't highlight image scans in Firefox 150+"

[b] https://www.firefox.com/en-CA/firefox/150.0.2/releasenotes/

[c] https://bugzilla.mozilla.org/show_bug.cgi?id=2024918

[d] https://www.mozilla.org/en-US/security/advisories/mfsa2026-4...

IainIreland•34m ago
I work at Mozilla; I fixed a bunch of these bugs.

In general, I would say that our use of "vulnerability" lines up with what jerrythegerbil calls "potential vulnerability". (In cases with a POC, we would likely use the word "exploit".) Our goal is to keep Firefox secure. Once it's clear that a particular bug might be exploitable, it's usually not worth a lot of engineering effort to investigate further; we just fix it. We spend a little while eyeballing things for the purpose of sorting into sec-high, sec-moderate, etc, and to help triage incoming bugs, but if there's any real question, we assume the worst and move on.

So were all 271 bugs exploitable? Absolutely not. But they were all security bugs according to the normal standards that we've been applying for years.

(Partial exception: there were some bugs that might normally have been opened up, but were kept hidden because Mythos wasn't public information yet. But those bugs would have been marked sec-other, and not included in the count.)

So if you think we're guilty of inflating the number of "real" vulnerabilities found by Mythos, bear in mind that we've also been consistently inflating the baseline. The spike in the Firefox Security Fixes by Month graph is very, very real: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin...

epistasis•30m ago
I'm not a security dev or researcher or anything, but as an outsider my understanding matches how Mozilla uses the terms. Though words used by specialists and the general public can often differ...
paulvnickerson•29m ago
What types of vulnerabilities was it finding? Cross-site scripting, privilege escalation, etc.? Mostly memory corruption, or any JavaScript logic bugs?
IainIreland•9m ago
I work on SpiderMonkey, so I mostly looked at the JS bugs. It was a smorgasbord of various things. Broadly speaking I'd say the most impressive bugs were TOCTOU issues, where we checked something and later acted on it, and the testcase found a clever way to invalidate the result of the check in between.

If you look closely at, say, this patch, you might get a sense of what I mean (although the real cleverness is in the testcase, which we have not made public): https://hg-edge.mozilla.org/integration/autoland/rev/c29515d...
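
As a generic illustration of the check-then-act (TOCTOU) pattern described above, here is a small Python sketch. It is not the SpiderMonkey bug or its patch; `Evil`, `get_unsafe`, and `get_safe` are hypothetical names, and the attacker callback is modeled with `__index__`, loosely analogous to a JS `valueOf` hook running during an argument conversion.

```python
import operator


class Evil:
    """Attacker-controlled value whose integer conversion has a side effect."""

    def __init__(self, buf):
        self.buf = buf

    def __index__(self):
        self.buf.clear()     # invalidate the bounds check made earlier
        return 0


def get_unsafe(buf, index, offset):
    if not 0 <= index < len(buf):      # check
        return None
    off = operator.index(offset)       # may run attacker code in between
    return buf[index + off]            # use: the earlier check may be stale


def get_safe(buf, index, offset):
    off = operator.index(offset)       # run all user-visible code first
    pos = index + off
    if not 0 <= pos < len(buf):        # check immediately before the use
        return None
    return buf[pos]
```

Calling `get_unsafe(buf, 2, Evil(buf))` on a three-element list raises `IndexError` in Python, because the list was emptied after the check passed. In a memory-unsafe language the same ordering mistake would be an out-of-bounds read. `get_safe` returns `None` for the same input because nothing attacker-controlled runs between its check and its use.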

crummy•28m ago
Curious if people think LLMs will lead to more secure or less secure software in five years.
int32_64•24m ago
Both. The skilled will use them to find problems, the unskilled will use them to slopcode insecure software the skilled will have to fix.
stavros•14m ago
That depends on which side has more money.
deferredgrant•23m ago
A vuln finder is useful only if it respects the humans on the other end. Every bogus report taxes the same scarce attention needed for the real bugs.
