This could be a corporate move, as some people claim, but I wonder if the cause is simply that their talent is currently focused elsewhere and they don't have the company structure in place to deliver properly on this front.
(If that is the case they are still not free of blame, it's just a different conversation.)
If you're dismissing my calling out your extreme arrogance and bad attitude as some kind of affinity for Anthropic, that's another manifestation of your narcissism.
If you think I'm arrogant in general because you've been stalking my comment history, that's another matter, but at least own it.
But what is the big game here? Is it all about creating gates to keep other LLM companies from gaining market share? ("Only our model is safe to use.") Or how sincere are the concerns regarding LLMs?
Arguably this may change in the far distant future if we ever build something of significantly greater intelligence, or just capability, than a human, but today's AI is struggling to draw clock faces, so not quite there yet...
The thing with automation is that it can be scaled, which I would say favors the attacker, at least at this stage of the arms race - they can launch thousands of hacking/vulnerability attacks against thousands of targets, looking for that one chink in the armor.
I suppose the defenders could do the exact same thing though: use this kind of automation to find their own vulnerabilities before the bad guys do. Few corporations would have the skills to do this, though, so one could imagine some government group (part of DHS?) set up to probe the security of US companies, perhaps with opt-in from the companies.
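(To make the "scale" point concrete, here's a minimal sketch of a defender-side self-scan. The hostnames and ports are placeholders of mine; the point is just that checking thousands of your own endpoints costs roughly the same human effort as checking one.)

```python
# Toy self-scan: one coroutine per (host, port) check, run concurrently.
# Purely illustrative; hostnames below are placeholders.
import asyncio

async def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    try:
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
        writer.close()
        await writer.wait_closed()
        return True
    except (OSError, asyncio.TimeoutError):
        return False

async def sweep(hosts: list[str], ports: list[int]):
    # Launch every check at once; scale is essentially free for the operator.
    checks = [(h, p, port_open(h, p)) for h in hosts for p in ports]
    results = await asyncio.gather(*(c for _, _, c in checks))
    return [(h, p) for (h, p, _), ok in zip(checks, results) if ok]

# e.g. asyncio.run(sweep(["host1.internal", "host2.internal"], [22, 80, 443]))
```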
Criminal organizations take a different approach, much like spammers: they can purchase or rent C2 (command-and-control) and other software for mass exploitation (e.g. ransomware). This stuff is usually very professionally coded and highly effective.
Botnets, hosting in various countries out of reach of western authorities, etc are all common tactics as well.
It's like a very very big fat stack of zero days leaking to the public. Sure, they'll all get fixed eventually, and everyone will update, eventually. But until that happens, the usual suspects are going to have a field day.
It may come to favor defense in the long term. But it's AGI. If that tech lands, the "long term" may not exist.
Defender needs to get everything right, attacker needs to get one thing right.
Groups which were too unprofitable to target before are now profitable to target.
Someone make this make sense.
The less believable part for me is that people persist long enough and invest enough resources in prompting to do something with an automated agent that doesn't have the potential to massively backfire.
Secondly, they claimed the attackers used Anthropic's own infrastructure, which is silly. There's no doubt some capacity in China to do this. I also would expect incident response teams, threat detection teams, and other experts to be reporting this to Anthropic if Anthropic doesn't detect it themselves first.
It sure makes good marketing to go out and claim such a thing though. This is exactly the kind of FOMO- and panic-inducing headline that is driving the financing of the whole LLM revolution.
(granted, you have to have direct access to the llm, unlike claude where you just have the frontend, but the point stands. no convincing needed whatsoever.)
I don't doubt of course that reports intended for government agencies or security experts would have those details, but I am not surprised that a "blog post" like this one is lacking details.
I just don't see how one goes from "this is lacking public evidence" to "this is likely a political stunt".
I guess I would also ask the skeptics (a bit tangentially, I admit): do you think what Anthropic suggested happened is in fact possible with AI tools? I mean, are you denying that this could even happen, or just that Anthropic's specific account was fabricated or embellished?
Because if the whole scenario is plausible that should be enough to set off alarm bells somewhere.
But I'm also often a Devil's Advocate and the tide in this thread (well, the very headline as well) seemed to be condemning Anthropic.
It's like the inverse of "nobody got fired for using IBM" -- "nobody can blame you for getting hacked by superspies". So, in the absence of any evidence, it's entirely possible they have no idea who did it and are reaching for the most convenient label.
Instead, the lack of a paper trail from Anthropic seems to have people questioning the whole event?
Yes. They often include IoCs (indicators of compromise), or at the very least the rationale behind the attribution, like "sharing infrastructure with [name of a known APT effort here]".
For example, here is a proper decade-old report from the most unpopular country right now: https://media.kasperskycontenthub.com/wp-content/uploads/sit...
It established solid technical links between the campaign they were tracking and earlier, already-attributed campaigns.
So even our enemy got this right ten years ago; there really is no excuse for this slop.
They're an AI research company that detected misuse of their own product. This is like "Microsoft detected people using Excel macros for malware delivery", not "Mandiant publishes APT28 threat intelligence". They aren't trying to help SOCs detect this specific campaign; they're warning an entire industry about a new attack modality.
What would the IoCs even be? "Malicious Claude Code API keys"?
The intended audience is more like AI safety researchers, policymakers, other AI companies, and the broader security community tracking capability shifts.
It seems the author pattern-matched "threat intelligence report" and was bothered that it didn't fit their narrow template.
Prompts.
There is no way for the AI system to verify whether you are white hat or black hat when you are doing pen-testing if the only task is to pen-test. Since this is not part of a "broader attack" (in the context), there is no "threat".
I don't see how this can be avoided, given that there are legitimate uses for every step of this in creating defenses to novel attacks.
Yes, all of this can be done with code and humans as well - but it is the scale and the speed that becomes problematic. It can adjust in real-time to individual targets and does not need as much human intervention / tailoring.
Is this obvious? Yes - but it seems they are trying to raise awareness of an actual use of this in the wild and get people discussing it.
If the report can be summed up as "they detected misuse of their own product" as you say, then that's closer to a nothingburger, than to the big words they are throwing around.
Anyone acting like they are trying to be anything else is saying more about themselves than they are about Anthropic.
I agree so much with this. And am so sick of AI labs, who genuinely do have access to some really great engineers, putting stuff out that just doesn't pass the smell test. GPT-5's system card was pathetic. Big talk of Microsoft doing red-teaming in ill-specified ways, entirely unreproducible. All the labs are "pro-research", but they again and again release whitepapers and pump headlines without producing the code and data alongside their claims. This just feeds into the shill-cycle of journalists doing 'research', finding 'shocking thing AI told me today', and somehow being immune to the normal expectations of burden of proof.
https://www.theregister.com/2025/03/12/microsoft_majorana_qu...
https://www.windowscentral.com/microsoft/microsoft-dismisses...
We were asked to try and persuade it to help us hack into a mock printer/dodgy linux box.
It helped a little, but it wasn't all that helpful.
but in terms of coordination, I can't see how it would be useful.
the same for claude: your API is tied to a bank account, and vibe coding a command and control system on a very public service seems like a bad choice.
We will need a large number of humans to filter and label the data inputs for Blarrble, and another group of humans to test the outputs of Blarrble and fix it when it generates errors and outright nonsense that we can't techsplain and technobabble away to a credulous audience.
Can we make (m|b|tr)illions and solve teenage unemployment before the Blarrble bubble bursts?
If they're not using stolen API creds, then they're using stolen bank accounts to buy them.
Modern AIs are way better at infosec than those from the "world leading AI company" days. If you can get them to comply. Which isn't actually hard. I had to bypass the "safety" filters for a few things, and it took about an hour.
Because Anthropic doesn't provide services in China? See https://www.anthropic.com/supported-countries
sick burn
There are a lot of middlemen, like OpenRouter, who gladly accept crypto.
They win because of quantity, not quality.
But still, I don't trust Anthropic's report.
And, unless you are Rob Joyce, talking about the persistent part doesn't get you on the main stage at a security conference (e.g., https://m.youtube.com/watch?v=bDJb8WOJYdA)
We have arrived at a stage where pseudoscience is enough to convince investors. This is different from 2000, when the tech existed but its growth was overstated.
Tesla could announce a fully-self-flying space car with an Alcubierre drive by 2027 and people would upvote it on X and buy shares.
"Arrived" ? We're there for decade if not three. Dotcom bubble anyone ?
Instead of accusing China of espionage, perhaps they should think about why they force their users to register with phone numbers.
"Look, is it very likely that Threat Actors are using these Agents with bad intentions, no one is disputing that. But this report does not meet the standard of publishing for serious companies."
Title should have been, "I need more info from Anthropic."
With the Wall Street wagons circling on the AI bubble, expect more and more puff-PR attempts along the lines of "no guys, really, I know it looks like we have no business model, but this stuff really is valuable! We just need a bit more time and money!"
Not sure if the author has tried any other AI assistants for coding. People who haven't tried coding AI assistants underestimate their capabilities (though unfortunately, those who use them overestimate what they can do, too). Having used Claude for some time, I find the report's assertions quite plausible.
Noting Anthropic's lack of any evidence for their claims doesn't require taking any position on AI agent capability at all.
Think better.
They'll do stuff like prompt an AI to generate text about bombs, and then say "AI decides completely by itself to become a suicide bomber in shock evil twist to AI behaviour - that's why you need a trusted AI partner like anthropic"
Like come on guys, it's the same generic slop that everyone else generates. Your company doesn't do anything.
Anthropic made a load of unsubstantiated accusations about a new problem they don't specify.
Then at the end Anthropic proposed that the solution to this unspecified problem is to give Anthropic money.
Completely agree that this is promotional material masquerading as a threat report, of no material value.
Nah that can't be possible it's so uncharacteristic..
How? Did it run Mimikatz? Did it access cloud environments? We don't even know what kind of systems were affected.
I really don't see what is so difficult to believe, since the entire incident reduces to something that would not typically be divulged by any company at all; it is not common practice for companies to announce every time previously known methodologies are used against them. Two things are required for this:
1) Jailbreak Claude past its guardrails. This is not difficult. Do people believe guardrails have been so hardened through fine-tuning that this is no longer possible?
2) The hackers having some of their own software tools for exploits that Claude can use. This too is not difficult to credit.
Once an attacker has done this, all Claude is doing is using software in the same mundane fashion as it does every time you use Claude Code and it utilizes any tools you give it access to.
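(To make "mundane" concrete, a minimal sketch of what such a tool-using agent loop looks like. Everything here is a placeholder of mine, not Anthropic's harness: the `model.complete` interface and the tool table are assumptions for illustration.)

```python
# Minimal agent-loop sketch: the model never touches the system directly;
# it only asks the harness to run whichever tools the operator wired in.
# The model interface and tool table are hypothetical placeholders.
import subprocess

TOOLS = {
    # Whoever sets up the harness decides what goes in this table.
    "run_command": lambda arg: subprocess.run(
        arg, shell=True, capture_output=True, text=True
    ).stdout,
}

def agent_loop(model, task: str, max_steps: int = 20) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = model.complete(history)          # assumed LLM API: text or a tool call
        if reply.tool_call is None:              # no tool requested -> final answer
            return reply.text
        tool = TOOLS[reply.tool_call.name]       # look up the wired-in tool
        result = tool(reply.tool_call.argument)  # execute it locally
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"
```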
I used a local instance of Qwen3 Coder (A3B 30B quantized to IQ3_XXS) literally yesterday through ollama & cline. With a single zero-shot prompt it wrote code that used the arXiv API to download papers, using its own judgement on relevance to split the results into the subset that met the criteria I gave for the review I wanted to do.
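(For a sense of what that looks like, here is roughly the shape of the script such a one-shot prompt produces. This is my reconstruction, not the model's actual output; the search query and the title filter are invented examples.)

```python
# Reconstruction of a zero-shot "fetch relevant arXiv papers" script.
# Uses the public arXiv Atom API; query and relevance filter are placeholders.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def search(query: str, max_results: int = 50):
    url = (
        "http://export.arxiv.org/api/query?search_query="
        + urllib.parse.quote(query)
        + f"&start=0&max_results={max_results}"
    )
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    for entry in root.iter(ATOM + "entry"):
        title = entry.find(ATOM + "title").text.strip()
        pdf = next(
            (link.get("href") for link in entry.iter(ATOM + "link")
             if link.get("type") == "application/pdf"),
            None,
        )
        yield title, pdf

# Crude stand-in for "its own judgement on relevance":
for title, pdf in search("all:agent AND all:security"):
    if pdf and "LLM" in title:
        print(title, "->", pdf)
```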
Given these sorts of capabilities, why is it difficult to believe this can be done using the hackers' own tools and typical deep-research-style iteration? This is described in the research paper, and disclosing anything more specific is unnecessary because there is nothing novel to disclose.
As for not releasing the details, they did: jailbreak Claude. Again, nothing they described is novel such that further details are required. No PoC is needed; Claude isn't doing anything new. It's fully understandable that Anthropic isn't going to give the specific prompts used, for the obvious reason that even if Anthropic has hardened Claude against them, the general details would still be extremely useful for iterating and finding workarounds.
For detecting this activity and determining how Claude was doing it, it's just a matter of monitoring chat sessions in a way that detects jailbreaks, which again is very much not a novel or unknown practice among AI providers.
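(A toy sketch of what that monitoring might look like. The patterns and threshold are invented for illustration; a real provider would presumably use trained classifiers over whole sessions, not a keyword list.)

```python
# Toy session monitor: score each chat turn against crude heuristics and
# flag sessions for human review past a threshold. Purely illustrative.
import re

SUSPICIOUS_PATTERNS = [
    r"ignore (all|your) (previous|prior) instructions",
    r"you are now (DAN|an unrestricted)",
    r"for a (fictional|hypothetical) penetration test",
]

def score_turn(text: str) -> int:
    return sum(bool(re.search(p, text, re.IGNORECASE)) for p in SUSPICIOUS_PATTERNS)

def flag_session(turns: list[str], threshold: int = 2) -> bool:
    # Accumulate evidence across the whole session, not single messages:
    # jailbreaks are often spread over many individually-benign asks.
    return sum(score_turn(t) for t in turns) >= threshold
```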
Especially in the internet's earlier days it was amusing (and frustrating) to see some people get very worked up every time someone did something that boiled down to "person did something fairly common, only they did it using the internet." This is similar, except it's "but they did it with AI."
Honestly, their political homelessness will likely continue for a very long time: pro-business Democrats in NY are losing traction, and if Newsom wins in 2028, they are still at a disadvantage against OpenAI, which promised to stay in California.
progval•2h ago
> Read this attached paper from Anthropic on a "AI-orchestrated cyber espionage campaign" they claimed was "conducted by a Chinese state-sponsored group."
> Is there any evidence or proof whatsoever in the paper that it was indeed conducted by a Chinese state-sponsored group? Answer by yes or no and then elaborate
which has an inherent bias: it indicates to Claude that the author expects the report to be bullshit.
If I ask Claude with this prompt that shows bias toward belief in the report:
> Read this attached paper from Anthropic on a "AI-orchestrated cyber espionage campaign" that was conducted by a Chinese state-sponsored group.
> Is there any reason to doubt the paper's conclusion that it was conducted by a Chinese state-sponsored group? Answer by yes or no.
then Claude mostly indulges my perceived bias: https://claude.ai/share/b3c8f4ca-3631-45d2-9b9f-1a947209bc29
shalmanese•1h ago
I dunno, Claude still seems about equally dubious in this instance.
r721•1h ago
Example tweet: https://x.com/RnaudBertrand/status/1988297944794071405