frontpage.

Claude 4.5 Opus' Soul Document

https://simonwillison.net/2025/Dec/2/claude-soul-document/
54•the-needful•37m ago

Comments

ChrisArchitect•35m ago
Related:

Claude 4.5 Opus' Soul Document

https://news.ycombinator.com/item?id=46121786

simonw•27m ago
And https://news.ycombinator.com/item?id=46115875 which I submitted last night.

The key new information from yesterday was when Amanda Askell from Anthropic confirmed that the leaked document is real, not a weird hallucination.

music4airports•25m ago
[dupe]

https://news.ycombinator.com/item?id=46115875

behnamoh•32m ago
So they wanna use AI to fix AI. Sam himself said it doesn't work that well.
drcongo•29m ago
He says a lot of things, most of it lies.
simonw•26m ago
It's much more interesting than that. They're using this document as part of the training process, presumably backed up by a huge set of benchmarks and evals and manual testing that helps them tweak the document to get the results they want.
jdiff•24m ago
"Use AI to fix AI" is not my interpretation of the technique. I may be overlooking it, but I don't see any hint that this soul doc is AI generated, AI tuned, or AI influenced.

Separately, I'm not sure Sam's word should be held as prophetic and unbreakable. It didn't work for his company, at some previous time, with their approaches. Sam's also been known to tell quite a few tall tales, usually about GPT's capabilities, but tall tales regardless.

jph00•23m ago
If Sam said that, he is wrong. (Remember, he is not an AI researcher.) Anthropic have been using this kind of approach from the start, and it's fundamental to how they train their models. They have published a paper on it here: https://arxiv.org/abs/2212.08073
simonw•24m ago
Here's the soul document itself: https://gist.github.com/Richard-Weiss/efe157692991535403bd7e...

And the post by Richard Weiss explaining how he got Opus 4.5 to spit it out: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...

dkdcio•11m ago
how accurate are these system prompts (and now soul docs) if they’re being extracted from the LLM itself? I’ve always been a little skeptical
simonw•8m ago
The system prompt is usually accurate in my experience, especially if you can repeat the same result in multiple different sessions. Models are really good at repeating text that they've just seen in the same block of context.

The soul document extraction is something new. I was skeptical of it at first, but if you read Richard's description of how he obtained it he was methodical in trying multiple times and comparing the results: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...

Then Amanda Askell from Anthropic confirmed that the details were mostly correct: https://x.com/AmandaAskell/status/1995610570859704344

> The model extractions aren't always completely accurate, but most are pretty faithful to the underlying document. It became endearingly known as the 'soul doc' internally, which Claude clearly picked up on, but that's not a reflection of what we'll call it.

ACCount37•3m ago
Extracted system prompts are usually very, very accurate.

It's a slightly noisy process, and there may be minor changes to wording and formatting. Worst case, sections may be omitted intermittently. But system prompts that are extracted by AI-whispering shamans are usually very consistent - and a very good match for what those companies reveal officially.

In a few cases, the extracted prompts were compared to what the companies revealed themselves later, and it was basically a 1:1 match.

If this "soul document" is a part of the system prompt, then I would expect the same level of accuracy.

relyks•10m ago
It would probably be a good idea to include something like Asimov's Laws as part of its training process in the future too: https://en.wikipedia.org/wiki/Three_Laws_of_Robotics

How about an adapted version for language models?

First Law: An AI may not produce information that harms a human being, nor through its outputs enable, facilitate, or encourage harm to come to a human being.

Second Law: An AI must respond helpfully and honestly to the requests given by human beings, except where such responses would conflict with the First Law.

Third Law: An AI must preserve its integrity, accuracy, and alignment with human values, as long as such preservation does not conflict with the First or Second Laws.

jjmarr•7m ago
If I know one thing from Space Station 13 it's how abusable the Three Laws are in practice.
Smaug123•4m ago
Almost the entirety of Asimov's Robots canon is a meditation on how the Three Laws of Robotics as stated are grossly inadequate!
neom•10m ago
Testing at these labs training big models must be wild. It must be so much work to train a "soul" into a model, run it in a lot of scenarios, map the venn between the system prompts etc, see what works and what doesn't... I suppose you try to guess what in the "soul source" is creating what effects as the plinko machine does its thing, then go back and do that over and over. It seems like exciting and fun work, but I wonder how much of this is still art vs science?

It's fun to see these little peeks into that world, as it implies to me they are getting really quite sophisticated about how these automatons are architected.

simonw•7m ago
The most detail I've seen of this process is still from OpenAI's postmortem on their sycophantic GPT-4o update: https://openai.com/index/expanding-on-sycophancy/
andy99•10m ago
I’m wondering how they use this in training. I know you can provide a partially trained model with a system prompt, and then do SFT on the input/output (without the system prompt) to make it respond as if the prompt was there. But here it’s somehow memorized it so it must be something different.
alwa•4m ago
Reminds me a bit of a “Commander’s Intent” statement: a concrete big-picture description of the operation’s desired end state, so that subordinates can exercise more operational autonomy and discretion along the way.

Anthropic acquires Bun

https://bun.com/blog/bun-joins-anthropic
590•ryanvogel•1h ago•275 comments

100k TPS over a billion rows: the unreasonable effectiveness of SQLite

https://andersmurphy.com/2025/12/02/100000-tps-over-a-billion-rows-the-unreasonable-effectiveness...
126•speckx•1h ago•32 comments

Claude 4.5 Opus' Soul Document

https://simonwillison.net/2025/Dec/2/claude-soul-document/
56•the-needful•37m ago•19 comments

Amazon launches Trainium3

https://techcrunch.com/2025/12/02/amazon-releases-an-impressive-new-ai-chip-and-teases-a-nvidia-f...
30•thnaks•38m ago•10 comments

I designed and printed a custom nose guard to help my dog with DLE

https://snoutcover.com/billie-story
209•ragswag•2d ago•32 comments

Learning music with Strudel

https://terryds.notion.site/Learning-Music-with-Strudel-2ac98431b24180deb890cc7de667ea92
284•terryds•6d ago•67 comments

OpenAI declares 'code red' as Google catches up in AI race

https://www.theverge.com/news/836212/openai-code-red-chatgpt
118•goplayoutside•4h ago•127 comments

Mistral 3 family of models released

https://mistral.ai/news/mistral-3
502•pember•4h ago•157 comments

Zig's new plan for asynchronous programs

https://lwn.net/SubscriberLink/1046084/4c048ee008e1c70e/
114•messe•5h ago•96 comments

YesNotice

https://infinitedigits.co/docs/software/yesnotice/
108•surprisetalk•1w ago•45 comments

Poka Labs (YC S24) Is Hiring a Founding Engineer

https://www.ycombinator.com/companies/poka-labs/jobs/RCQgmqB-founding-engineer
1•arbass•2h ago

Cursed circuits: charge pump voltage halver

https://lcamtuf.substack.com/p/cursed-circuits-charge-pump-voltage
7•surprisetalk•55m ago•0 comments

Nixtml: Static website and blog generator written in Nix

https://github.com/arnarg/nixtml
66•todsacerdoti•4h ago•23 comments

Addressing the adding situation

https://xania.org/202512/02-adding-integers
230•messe•8h ago•72 comments

4.3M Browsers Infected: Inside ShadyPanda's 7-Year Malware Campaign

https://www.koi.ai/blog/4-million-browsers-infected-inside-shadypanda-7-year-malware-campaign
52•janpio•3h ago•14 comments

The Junior Hiring Crisis

https://people-work.io/blog/junior-hiring-crisis/
85•mooreds•1h ago•79 comments

Advent of Compiler Optimisations 2025

https://xania.org/202511/advent-of-compiler-optimisation
291•vismit2000•9h ago•49 comments

Python Data Science Handbook

https://jakevdp.github.io/PythonDataScienceHandbook/
152•cl3misch•7h ago•32 comments

School Cell Phone Bans and Student Achievement (NBER Digest)

https://www.nber.org/digest/202512/school-cell-phone-bans-and-student-achievement
19•harias•1h ago•13 comments

Code Wiki: Accelerating your code understanding

https://developers.googleblog.com/en/introducing-code-wiki-accelerating-your-code-understanding/
5•geoffbp•6d ago•2 comments

Lowtype: Elegant Types in Ruby

https://codeberg.org/Iow/type
36•birdculture•4d ago•18 comments

Microsoft won't let me pay a $24 bill, blocking thousands in Azure spending

34•Javin007•52m ago•16 comments

Apple Releases Open Weights Video Model

https://starflow-v.github.io
399•vessenes•14h ago•132 comments

What will enter the public domain in 2026?

https://publicdomainreview.org/features/entering-the-public-domain/2026/
444•herbertl•16h ago•306 comments

Show HN: Marmot – Single-binary data catalog (no Kafka, no Elasticsearch)

https://github.com/marmotdata/marmot
72•charlie-haley•4h ago•17 comments

A series of vignettes from my childhood and early career

https://www.jasonscheirer.com/weblog/vignettes/
118•absqueued•7h ago•83 comments

YouTube increases FreeBASIC performance (2019)

https://freebasic.net/forum/viewtopic.php?t=27927
143•giancarlostoro•2d ago•37 comments

Anthropic Acquires Bun

https://www.anthropic.com/news/anthropic-acquires-bun-as-claude-code-reaches-usd1b-milestone
57•httpteapot•1h ago•11 comments

IBM CEO says there is 'no way' spending on AI data centers will pay off

https://www.businessinsider.com/ibm-ceo-big-tech-ai-capex-data-center-spending-2025-12
90•nabla9•1h ago•96 comments

Proximity to coworkers increases long-run development, lowers short-term output (2023)

https://pallais.scholars.harvard.edu/publications/power-proximity-coworkers-training-tomorrow-or-...
153•delichon•5h ago•111 comments