AI Cybersecurity After Mythos: The Jagged Frontier

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier

12•evelinag•2d ago

Comments

baq•2d ago

> TL;DR: We tested Anthropic Mythos's showcase vulnerabilities on small, cheap, open-weights models. They recovered much of the same analysis. AI cybersecurity capability is very jagged: it doesn't scale smoothly with model size, and the moat is the system into which deep security expertise is built, not the model itself. Mythos validates the approach but it does not settle it yet.

Notably, Kimi K2 and GPT-OSS-120b do quite well when provided with the isolated context. Article seems to be heavily LLM-assisted, but the content itself is good.

1970-01-01•1d ago

I'm awaiting general release so I can root and jailbreak some old Android phones and iPhones. If it succeeds, I'm a fan. If it fails, then it's obviously not a leap, it's another step.

Cluelessidoit•1d ago

This is actually a solid test

cedws•1d ago

In my experience, asking OpenAI or Anthropic models to do anything FAANG doesn't want you to do usually gets rejected: for example, reverse engineering an app or cracking your own device.

tao_oat•1d ago

> Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior").

"Often with contextual hints" is doing some heavy lifting here, IMO. I agree with the article's premise -- you don't need Mythos to use AI to find novel, complex vulnerabilities -- but these results as presented are somewhat misleading.

akavel•1d ago

AFAIU, their claim is that Mythos is in reality used in a framework that builds such contextual hints, and that their (Aisle's) own framework does the same:

"(...) a well-designed scaffold naturally produces this kind of scoped context through its targeting and iterative prompting stages, which is exactly what both AISLE's and Anthropic's systems do."

cyanydeez•1d ago

All evidence points to LLMs not being sufficient on their own for the tasks everyone wants them to do. The harnesses and agentic capabilities that shove them through JSON-shaped holes are utterly necessary, along with all the security around them, which suggests there's no great singularity happening here.

The current tech is a sigmoid, and even using the abilities of the AI, novelty and improvements don't appear to be happening at any exponential pace.

tao_oat•1d ago

> The current tech is a sigmoid

What makes you say that? I'm only asking because the data I've seen looks pretty cleanly exponential still, e.g. https://metr.org.
