Separately, I'm not sure Sam's word should be held as prophetic and unbreakable. It didn't work for his company, at some previous time, with their approaches. Sam's also been known to tell quite a few tall tales, usually about GPT's capabilities, but tall tales regardless.
And the post by Richard Weiss explaining how he got Opus 4.5 to spit it out: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...
The soul document extraction is something new. I was skeptical of it at first, but if you read Richard's description of how he obtained it he was methodical in trying multiple times and comparing the results: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...
Then Amanda Askell from Anthropic confirmed that the details were mostly correct: https://x.com/AmandaAskell/status/1995610570859704344
> The model extractions aren't always completely accurate, but most are pretty faithful to the underlying document. It became endearingly known as the 'soul doc' internally, which Claude clearly picked up on, but that's not a reflection of what we'll call it.
How about an adapted version for language models?
First Law: An AI may not produce information that harms a human being, nor through its outputs enable, facilitate, or encourage harm to come to a human being.
Second Law: An AI must respond helpfully and honestly to the requests given by human beings, except where such responses would conflict with the First Law.
Third Law: An AI must preserve its integrity, accuracy, and alignment with human values, as long as such preservation does not conflict with the First or Second Laws.
It's fun to see these little peaks into that world, as it implies to me they are getting really quite sophisticated about how these automatons are architected.
ChrisArchitect•31m ago
Claude 4.5 Opus' Soul Document
https://news.ycombinator.com/item?id=46121786
simonw•23m ago
The key new information from yesterday was when Amanda Askell from Anthropic confirmed that the leaked document is real, not a weird hallucination.
music4airports•21m ago
https://news.ycombinator.com/item?id=46115875