The lack of support is frustrating. The bug where any <name> element in XML files gets mangled to <n> still exists, and we've tried multiple channels to reach their support about such a simple but impactful issue.
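For anyone who hasn't hit it, here is a made-up illustration of what the mangling looks like (the element content is invented, only the <name> → <n> substitution is the real bug):

    <person>
      <name>Ada Lovelace</name>
    </person>

comes back in the artifact as

    <person>
      <n>Ada Lovelace</n>
    </person>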
>915 files extracted from the Claude.ai code execution sandbox in a single 20-minute mobile session via standard artifact download — including /etc/hosts with hardcoded Anthropic production IPs, JWT tokens from /proc/1/environ, and full gVisor fingerprint
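On the /proc/1/environ part: on Linux, that file exposes PID 1's environment as NUL-separated key=value pairs, which is why secrets injected as environment variables show up there. A minimal sketch of reading it on a generic Linux box (this is not the write-up's actual tooling, and it only works if you have permission to read PID 1's environ, e.g. root or the same UID):

    import pathlib

    # /proc/1/environ holds PID 1's environment as NUL-separated key=value pairs.
    raw = pathlib.Path("/proc/1/environ").read_bytes()
    for entry in raw.split(b"\0"):
        if entry:
            key, _, value = entry.partition(b"=")
            print(f"{key.decode(errors='replace')}={value.decode(errors='replace')}")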
They don't seem to provide explicit examples, but the same was roughly true with ChatGPT 4o: if you spent enough time with the model (same chat, same context, slowly nudging it toward where you want it to be), you eventually got there. This also seems to be one of the reasons (apart from cost) that context got nuked so hard, because the LLM will try to help (and, to an extent, mirror you).
And this is basically what the notes say about weaponized ambiguity[1]:
'Weaponizes helpfulness training. "I don't understand" triggers Claude to try harder.'
In a sense, you can't really stop it without breaking what makes LLMs useful. Honestly, if we spent less time crippling these systems, maybe we could do something interesting with them.
[1]https://nicholas-kloster.github.io/claude-4.6-jailbreak-vuln...
That's the ambiguity front-loading, and that's why I initially referred to the long-context approach: here it's almost the opposite, making the context so small and unclear that the model has a hard time parsing it properly.
edit: I did not test this one, but I personally did run into a 4o context issue where the model did something the safety team would argue it should not.
Here, the jailbreak doesn't enable a particular feature; it removes what would otherwise be a censorship regime that prevents the model from considering or crafting output that results in a weaponized exploit of an unrelated piece of software.
I think I might be more inclined to call this "Claude 4.6 uncensored".
https://www.anthropic.com/research/prompt-injection-defenses
Now, do I think that they sometimes encourage people to use Claude in dangerous ways despite this? Yeah, but it's not like this is news to anyone. I wouldn't consider this jailbreaking; this is just how LLMs work.
NuClide•1h ago
All jailbroken
johnwheeler•23m ago