No.
Like everything else an LLM touches, it is prone to slop and hallucinations.
You still need someone who knows what they are doing to review (and preferably manually validate) the findings.
What all this recent hype carefully glosses over is the volume of false-positives. I guarantee you it is > 0 and most likely a fairly large number.
And like most things LLM, the bigger the codebase the more likely the false-positives due to self-imposed context window constraints.
Its all very well these blog posts saying "LLM found this serious bug in Firefox", well yeah but that's only because the security analyst filtered out all the junk (and knew what to ask the LLM in the prompt in the first place).
Another way to see this is that you mentioned "LLM found this serious bug in Firefox", but the actual number in that Mozilla report [2] was 14 high-severity bugs, and 90 minor ones. However you look at it, it's an impressive result for a security audit, and I dount that the Antropic team had to manually filter out hundreds-to-thousands of false-positives to produce it.
They did have to manually write minimal exploits for each bug, because Opus was bad at it[3]. This is a problem that Mythos doesn't have. With access to Mythos, to repeat the same audit, you'd likely just need to make the model itself write all the exploits, which incidentally would also filter out a lot of the false positives. I think the hype is mostly justified.
[1] https://lwn.net/Articles/1065620/
[2] https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
In terms of quality ("are there bugs that professional humans can't see at any budget but LLMs can?") - it's not very clear, because Opus is still worse than a human specialist, but Mythos might be comparable. We'll just have to wait and see what results Project Glasswing gets.
Either way, cybersecurity is going to get real weird real soon, because even slightly-dumb models can have a large effect if they are cheap and fast enough.
EDIT: Mozilla thinks "no" to the second question, by the way: "Encouragingly, we also haven’t seen any bugs that couldn’t have been found by an elite human researcher.", when talking about the 271 vulnerabilities recently found by Mythos. https://blog.mozilla.org/en/firefox/ai-security-zero-day-vul...
Being flooded with these kind of reports can make the actual real problems harder to see.
The plural of "Opus" is "Opera". Might be a tad confusing tho :)
Of course some people don't do that, and send all the reports anyway... and then scream from the hilltops about how incredible LLMs are when by sheer luck one happens to be right. Not only is that blatant p-hacking, it's incredibly antisocial.
It's disingenuous marketing speak to say LLMs are "finding" any security holes at all: they find a thousand hypotheticals of which one or two might be real. A broken clock is right twice a day.
staticassertion•1h ago
sigmoid10•1h ago
bastawhiz•30m ago
- Nobody is familiar with the code
- Almost all of the recent fixes are from static analysis
- Nobody is even sure if anyone uses the code
This feels a lot like CPython culling stdlib modules and making them pypi packages. The people who rely on those things have a little bit of extra work if they want a recent kernel version, and everyone else benefits (directly or indirectly) by way of there being less stuff that needs attention.
fluidcruft•1h ago
The overlap of bugs being found, nobody caring enough to bother read the reports or fix the code, and nobody caring that the modules are pushed out of main seems good.
traceroute66•1h ago
Yes, I don't see the point of maintaining technical debt just for the sake of it.
The security environment in 2026 is such that legacy unmaintained code is a very real security risk for obscure zero-days to exploit to gain a foot in the door.
Reading through the list I don't see it being an issue for the overwhelming majority of Linux users.
Who, for example, still uses ISDN in 2026 ? Most telcos have stopped all new sales and existing ISDN circuits will be forcefully disconnected within 3–5 years as the telcos complete their FTTP build-outs and the copper network is subsequently decomissioned.
goalieca•31m ago
In general, drivers make up the largest attack surface in the kernel and many of them are just along for the ride rather than being actively maintained and reviewed by researchers.