Claude 4.5 Opus' Soul Document

https://simonwillison.net/2025/Dec/2/claude-soul-document/
49•the-needful•33m ago

Comments

ChrisArchitect•31m ago
Related:

Claude 4.5 Opus' Soul Document

https://news.ycombinator.com/item?id=46121786

simonw•23m ago
And https://news.ycombinator.com/item?id=46115875 which I submitted last night.

The key new information from yesterday was when Amanda Askell from Anthropic confirmed that the leaked document is real, not a weird hallucination.

music4airports•21m ago
[dupe]

https://news.ycombinator.com/item?id=46115875

behnamoh•28m ago
So they wanna use AI to fix AI. Sam himself said it doesn't work that well.
drcongo•25m ago
He says a lot of things, most of it lies.
simonw•22m ago
It's much more interesting than that. They're using this document as part of the training process, presumably backed up by a huge set of benchmarks and evals and manual testing that helps them tweak the document to get the results they want.
jdiff•21m ago
"Use AI to fix AI" is not my interpretation of the technique. I may be overlooking it, but I don't see any hint that this soul doc is AI generated, AI tuned, or AI influenced.

Separately, I'm not sure Sam's word should be held as prophetic and unbreakable. It didn't work for his company, at some previous time, with their approaches. Sam's also been known to tell quite a few tall tales, usually about GPT's capabilities, but tall tales regardless.

jph00•19m ago
If Sam said that, he is wrong. (Remember, he is not an AI researcher.) Anthropic have been using this kind of approach from the start, and it's fundamental to how they train their models. They have published a paper on it here: https://arxiv.org/abs/2212.08073
simonw•21m ago
Here's the soul document itself: https://gist.github.com/Richard-Weiss/efe157692991535403bd7e...

And the post by Richard Weiss explaining how he got Opus 4.5 to spit it out: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...

dkdcio•7m ago
How accurate are these system prompts (and now soul docs) if they’re being extracted from the LLM itself? I’ve always been a little skeptical.
simonw•4m ago
The system prompt is usually accurate in my experience, especially if you can repeat the same result in multiple different sessions. Models are really good at repeating text that they've just seen in the same block of context.

The soul document extraction is something new. I was skeptical of it at first, but if you read Richard's description of how he obtained it he was methodical in trying multiple times and comparing the results: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...

Then Amanda Askell from Anthropic confirmed that the details were mostly correct: https://x.com/AmandaAskell/status/1995610570859704344

> The model extractions aren't always completely accurate, but most are pretty faithful to the underlying document. It became endearingly known as the 'soul doc' internally, which Claude clearly picked up on, but that's not a reflection of what we'll call it.
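A minimal sketch of the "repeat it across sessions and compare" check described above. It assumes you already have the extracted text from several independent sessions. The function names, the 0.9 threshold, and the sample strings are all hypothetical, and this is not Richard's actual method.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_similarity(extractions):
    """Return a similarity ratio (0.0 to 1.0) for every pair of extracted texts."""
    return [
        SequenceMatcher(None, a, b).ratio()
        for a, b in combinations(extractions, 2)
    ]


def looks_consistent(extractions, threshold=0.9):
    """True when every pair of extractions is at least `threshold` similar.

    The threshold is an arbitrary illustrative value, not a calibrated one.
    """
    scores = pairwise_similarity(extractions)
    return bool(scores) and min(scores) >= threshold


# Made-up outputs standing in for three separate extraction sessions.
runs = [
    "You are a helpful assistant. Answer concisely.",
    "You are a helpful assistant. Answer concisely.",
    "You are a helpful assistant. Answer briefly.",
]

for score in pairwise_similarity(runs):
    print(f"{score:.2f}")
```

High agreement across sessions only shows the output is stable. It does not prove the text matches the original document, which is why the confirmation from Anthropic matters.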

relyks•7m ago
It will probably be a good idea to include something like Asimov's Laws as part of its training process in the future too: https://en.wikipedia.org/wiki/Three_Laws_of_Robotics

How about an adapted version for language models?

First Law: An AI may not produce information that harms a human being, nor through its outputs enable, facilitate, or encourage harm to come to a human being.

Second Law: An AI must respond helpfully and honestly to the requests given by human beings, except where such responses would conflict with the First Law.

Third Law: An AI must preserve its integrity, accuracy, and alignment with human values, as long as such preservation does not conflict with the First or Second Laws.

jjmarr•3m ago
If I know one thing from Space Station 13 it's how abusable the Three Laws are in practice.
neom•6m ago
Testing at these labs training big models must be wild. It must be so much work to train a "soul" into a model, run it in a lot of scenarios, work out the Venn overlap with the system prompts, etc., and see what works and what doesn't... I suppose you try to guess what in the "soul source" is creating what effects as the plinko machine does its thing, then go back and do that over and over. It seems like it would be exciting and fun work, but I wonder how much of this is still art vs. science?

It's fun to see these little peeks into that world, as it implies to me they are getting really quite sophisticated about how these automatons are architected.

simonw•3m ago
The most detail I've seen of this process is still from OpenAI's postmortem on their sycophantic GPT-4o update: https://openai.com/index/expanding-on-sycophancy/
andy99•6m ago
I’m wondering how they use this in training. I know you can provide a partially trained model with a system prompt, and then do SFT on the input/output (without the system prompt) to make it respond as if the prompt was there. But here it’s somehow memorized it so it must be something different.
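A rough sketch of the prompt-then-SFT idea mentioned above: generate responses with the system prompt in place, then keep only the input/output pairs so the prompt itself never appears in the fine-tuning data. `build_sft_pairs` and `fake_generate` are hypothetical names, the generator is a stand-in for a real model call, and nothing here is a claim about how Anthropic actually trains.

```python
def build_sft_pairs(generate, system_prompt, user_inputs):
    """Generate outputs WITH the system prompt, but store pairs WITHOUT it.

    `generate` is any callable taking (system_prompt, user_input) and
    returning the model's reply as a string.
    """
    pairs = []
    for user_input in user_inputs:
        output = generate(system_prompt, user_input)
        # The system prompt is deliberately dropped from the stored example.
        pairs.append({"prompt": user_input, "completion": output})
    return pairs


def fake_generate(system_prompt, user_input):
    """Stand-in for a real model: the reply style depends on the system prompt."""
    style = "formal" if "formal" in system_prompt else "casual"
    return f"({style}) reply to: {user_input}"


pairs = build_sft_pairs(fake_generate, "Always be formal.", ["hi", "bye"])
for pair in pairs:
    print(pair)
```

Fine-tuning on pairs like these would push the model toward the prompted behavior without the prompt being present. It would not obviously make the model able to recite the prompt text, which is the puzzle raised above.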
