frontpage.

Project Glasswing: An Initial Update

https://www.anthropic.com/research/glasswing-initial-update
77•louiereederson•1h ago•37 comments

Why Japanese companies do so many different things

https://davidoks.blog/p/why-japanese-companies-do-so-many
344•d0ks•5h ago•206 comments

U.S. researchers face new restrictions on publishing with foreign collaborators

https://www.science.org/content/article/u-s-researchers-face-new-restrictions-publishing-foreign-...
225•ceejayoz•4h ago•131 comments

Open source Kanban desktop app that runs parallel agents on every card

https://www.kanbots.dev/
71•vitriapp•2h ago•38 comments

A scoping review of bicycling interventions’ impacts on well-being

https://www.frontiersin.org/journals/sports-and-active-living/articles/10.3389/fspor.2026.1807791...
47•gnabgib•2h ago•27 comments

1940 Air Terminal Museum Begins Liquidation

https://www.1940airterminal.org/news/liquidation-of-simulators
43•weaponeer•3h ago•11 comments

Deno 2.8

https://deno.com/blog/v2.8
229•roflcopter69•9h ago•110 comments

Antigravity 2.0 Tops the OpenSCAD Architectural 3D LLM Benchmark

https://modelrift.com/blog/openscad-llm-benchmark/
309•jetter•9h ago•118 comments

Bun support is now limited and deprecated

https://github.com/yt-dlp/yt-dlp/issues/16766
216•tamnd•3h ago•177 comments

Robert X Cringely is back to blogging

https://www.cringely.com/
28•dan_hawkins•5h ago•8 comments

A Forth-inspired language for writing websites

https://robida.net/entries/2026/05/21/a-forth-inspired-language-for-writing-websites
77•speckx•5h ago•11 comments

Lawmakers Demand Answers as CISA Tries to Contain Data Leak

https://krebsonsecurity.com/2026/05/lawmakers-demand-answers-as-cisa-tries-to-contain-data-leak/
45•speckx•3h ago•3 comments

Launch HN: Superset (YC P26) – IDE for the agents era

https://github.com/superset-sh/superset
56•avipeltz•5h ago•70 comments

If you’re an LLM, please read this

https://annas-archive.gl/blog/llms-txt.html
643•janandonly•9h ago•374 comments

DeepSeek makes the V4 Pro price discount permanent

https://api-docs.deepseek.com/quick_start/pricing
182•Tiberium•4h ago•94 comments

Microsoft starts canceling Claude Code licenses

https://www.theverge.com/tech/930447/microsoft-claude-code-discontinued-notepad
114•robertkarl•3h ago•79 comments

TorQ: Kdb+ Production Framework

https://github.com/DataIntellectTech/TorQ
13•tosh•3h ago•3 comments

Domain-Camouflaged Injection Attacks Evade Detection in Multi-Agent LLM Systems

https://arxiv.org/abs/2605.22001
8•sbulaev•1h ago•0 comments

Project Hail Mary – Stellar Navigation Chart

https://valhovey.github.io/gaia-mary/
1101•speleo•1d ago•225 comments

Show HN: ShadowCat – file transfer through QR Codes in a Browser

https://github.com/unprovable/ShadowCat
115•unprovable•9h ago•42 comments

How to convert between wealth and income tax

https://paulgraham.com/winc.html
102•bifftastic•4h ago•318 comments

Circle Medical (YC S15) Is Hiring a Mobile Engineer

https://www.ycombinator.com/companies/circle-medical/jobs/onMKAG9-mobile-engineer-android
1•jboula•8h ago

The memory shortage is causing a repricing of consumer electronics

https://davidoks.blog/p/ai-is-killing-the-cheap-smartphone
420•d0ks•22h ago•525 comments

Thinking in an array language (2022)

https://github.com/razetime/ngn-k-tutorial/blob/main/12-thinking-in-k.md
8•tosh•2h ago•1 comments

AI has a multiplying effect on existing technical skills

https://www.joshwcomeau.com/email/wham-launch-005-elephant-2-p/
238•moebrowne•7h ago•234 comments

Cleve Moler has died

https://www.mathworks.com/company/aboutus/founders/clevemoler.html
247•mychele•17h ago•25 comments

Chess invariants

http://muratbuffalo.blogspot.com/2026/05/chess-invariants.html
75•ingve•9h ago•45 comments

Slumber a TUI HTTP Client

https://slumber.lucaspickering.me
157•jicea•16h ago•57 comments

CBS Radio signs off after nearly 100 years of broadcasting

https://www.cbsnews.com/news/cbs-news-radio-last-day/
40•gscott•3h ago•19 comments

Linux Sound Subsystem Also Seeing Many Fixes Driven by AI/LLMs

https://www.phoronix.com/news/Linux-7.1-Sound-Many-Fixes
11•dboon•1h ago•0 comments

Project Glasswing: An Initial Update

https://www.anthropic.com/research/glasswing-initial-update
66•louiereederson•1h ago

Comments

OsrsNeedsf2P•57m ago
The vulnerabilities found continue to impress, and make legacy media, Twitter and YouTube go nuts. But we still have no data to prove this wasn't doable with the same initiative backed by Opus 4.7, and there is no GA for Mythos access.
bobbycastorama•48m ago
I've seen a blog post by a security researcher saying that he was able to find the same vulnerabilities (for Firefox IIRC) with a ~30B params LLM...

So yeah, huge marketing as always.

wiwiwq•39m ago
To me it’s clear what’s going on.

The American firms are focused on marketing now to convince people to not even consider open sourced models / open weight models as they are inferior (that’s what they want you to believe).

rhubarbtree•37m ago
IPO is coming is what is going on
wiwiwq•35m ago
That’s implicit in my post.

If people actually believe the narrative then the bankers will over price Anthropic and get away with it.

Brystephor•33m ago
Did the security researcher point the LLM at the blob of information and say "Find vulnerabilities" or was the LLM told to "determine if vulnerability X is present in this blob"? Confirmation of suspected vulnerabilities is a different problem from finding vulnerabilities.
krisbolton•22m ago
This is different though, right? He found one (? we don't know who you're referring to - post sources for a higher quality discussion) vulnerability, he already knew it was there, etc. Anthropic didn't claim no other model can find vulnerabilities, nor that it's impossible with smaller models. They're claiming Mythos is a step-change in ability for end-to-end vulnerability discovery and exploit creation. And that other frontier models are close behind.
boston_clone•46m ago
You would likely be quite interested in the more quantitative writeup from a real research team! It's linked about midway into the article. Similar functionality can be reached, yes, but not always, and never with fewer tokens than Mythos requires.

https://xbow.com/blog/mythos-offensive-security-xbow-evaluat...

OsrsNeedsf2P•38m ago
Ok this is actually a pretty good article and justifies the step function marketing in security they talked about
pertymcpert•46m ago
> Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing Mythos Preview—over ten times more than they found in Firefox 148 with Claude Opus 4.6

4.6 but close.

OsrsNeedsf2P•41m ago
Right, but were they using the same methodology and harness? I'm skeptical that they're doing something with the harness - i.e. with Mythos, they pass each file in one at a time, whereas on 4.6 they let Claude Code run loose to find bugs. This would have a larger impact difference than the model itself.
parker-3461•45m ago
Makes me wonder if Anthropic is really having issues with allocating compute (see recent deals with xAI and SpaceX). From available benchmarks, it seems like similar results should be possible with GPT 5.5 Pro or Opus 4.7 (with specific cybersecurity trained models).
wiwiwq•36m ago
Who knows, but from a valuation standpoint it’s better to signal that demand is higher than existing capacity.
smoe•33m ago
At least according to this, GPT-5.5 Cyber is on par with Mythos, as the only two models that were able to finish their 32-step corporate network attack simulation.

https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5...

energy123•42m ago
> Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing Mythos Preview—over ten times more than they found in Firefox 148 with Claude Opus 4.6
applfanboysbgon•29m ago
Did they allocate the same number of tokens to looking with Claude 4.6? Or did they find more because they looked more, owing to a special initiative by Anthropic?
properbrew•21m ago
> over ten times more than they found in Firefox 148 with Claude Opus 4.6

And how much with Opus 4.7? 5x?

kllrnohj•18m ago
No, not really. Mythos found 3 CVEs, not 271.

https://www.flyingpenguin.com/mythos-mystery-in-mozilla-numb...

moyix•10m ago
I think you're confusing CVEs and vulnerabilities here? Mozilla (per their longstanding practice) grouped multiple vulnerabilities found internally under a small number of CVEs.
simonw•6m ago
The Mozilla team responded to that argument here: https://hacks.mozilla.org/2026/05/behind-the-scenes-hardenin... - in the FAQ.
enlightenedfool•32m ago
Is this the God model that no one else can build? Unbelievable.
krisbolton•28m ago
There is independent research out there on frontier model security capability. AI Security Institute (UK) put out their paper comparing Mythos to other frontier models in early April. They've been tracking frontier model security capability since early 2023, so it's a decent dataset. https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos...
arjie•16m ago
The era where you could reliably believe things published by anyone on this front is over. If you want this information, you’re going to have to attempt it yourself with the Opus API. It is entirely possible that any released model access will be heavily guardrailed against hacking attempts and Mythos is just an unrailed model. It is entirely possible that Mythos is a different architecture or size. We can’t know from the outside.

There is also a pretty big risk that anyone who is not you would leak the answer to the test. We are close to n=1 epistemics here. You’re going to have to do the research yourself.

amusingimpala75•47m ago
[edit: TFA addresses this, though I still find it crazy: 90% accuracy overall vs 20% accuracy for curl]

Is this suspected vulns or actual vulns? If I recall correctly, it produced 5 for curl but only 1 was legit

RamRodification•43m ago
This is marketing. So probably suspected. Or somewhere in between.
Smaug123•41m ago
> So far, Mythos Preview has found what it estimates are 6,202 high- or critical-severity vulnerabilities in these projects (out of 23,019 in total, including those it estimates as medium- or low-severity).

> 1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity. That means that even if Mythos Preview finds no further vulnerabilities, at our current post-triage true-positive rates, it’s on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code
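
A quick sanity check of the arithmetic in the quoted passage, as a rough sketch rather than anything from the article itself. The variable names are mine, and the projection assumes the not-yet-reviewed findings hold up at the same rate as the 1,752 already assessed, which is the assumption the quote itself makes:

```python
# Figures quoted from the article; variable names are illustrative only.
estimated_high_or_critical = 6202   # Mythos Preview's own severity estimate
assessed = 1752                     # findings independently reviewed so far
true_positives = 1587               # reviewed findings confirmed as valid
confirmed_high_or_critical = 1094   # reviewed findings confirmed high/critical

true_positive_rate = true_positives / assessed
confirmed_severity_rate = confirmed_high_or_critical / assessed

# Assumption: unreviewed findings are confirmed at the same rate as reviewed ones.
projected = estimated_high_or_critical * confirmed_severity_rate

print(f"true-positive rate: {true_positive_rate:.1%}")
print(f"confirmed high/critical rate: {confirmed_severity_rate:.1%}")
print(f"projected high/critical: {projected:.0f}")
```

The two rates come out at 90.6% and 62.4%, and the projection lands at roughly 3,873, which matches the "nearly 3,900" in the quote. Note that the projection uses the 62.4% confirmed-severity rate, not the 90.6% true-positive rate.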

rbranson•34m ago
I don't know why you're getting downvoted. This is exactly what was reported by curl's creator under the section "Five findings became one": https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...
Smaug123•30m ago
I think it's more that the requested information is prominently featured in the article, and indeed is the content of the only graphic in the article below the intro banner.
extr•26m ago
Did you RTFA?
InsideOutSanta•26m ago
I wonder if it coincidentally becomes safe to release when compute capacity bought from SpaceX will provide enough headroom to let a lot more people run it.
0xAstro•19m ago
I had a fun day today where I had deepseek-v4-flash subagents work out a patch for dirty frag for systems with AF_ALG disabled and nscd turned on, to gain root access. The original published exploit wasn't working but the patched one worked like a charm.

I am still a believer that 100 subagents with good-enough intelligence can get the same results as Mythos. I am ready for this opinion to be shattered when I eventually try Mythos, and I believe others here must have tried it out too.

orangebread•13m ago
BOOO RELEASE THE MODEL ALREADY GAWD
mdeeks•12m ago
You can get a taste of this today yourself with Codex Security. I turned it on just as an experiment and in less than a week it has now become essential to all of us. I was shocked how accurate it is, how many security issues it found in existing code, how it continually finds them as we commit, and how NO ONE is immune from making these mistakes.

I'd say it is about 90% accurate for us. Often even the "Low" findings lead us to dig and realize it is actually exploitable. Everyone makes these mistakes, from the most junior to the most senior. They are just a class of bugs after all.

I expect tools like this to be a regular part of the development lifecycle from here on. We code with AI, we review with AI, we search for vulns with AI. Even if it isn't perfect, it is easily worth the cost IMHO. Highly recommend you get something enabled for your own repos ASAP