
System Card: Claude Opus 4 and Claude Sonnet 4

https://simonwillison.net/2025/May/25/claude-4-system-card/
61•pvg•2h ago

Comments

saladtoes•1h ago
https://www.lakera.ai/blog/claude-4-sonnet-a-new-standard-fo...

These LLMs still fall short on a bunch of pretty simple tasks. For example, attackers can easily get Claude 4 to deny legitimate requests by manipulating third-party data sources.

simonw•1h ago
They gave a bullet point in that intro which I disagree with: "The only way to make GenAI applications secure is through vulnerability scanning and guardrail protections."

I still don't see guardrails and scanning as effective ways to prevent malicious attackers. They can't get to 100% effective, at which point a sufficiently motivated attacker is going to find a way through.

I'm hoping someone implements a version of the CaMeL paper - that solution seems much more credible to me. https://simonwillison.net/2025/Apr/11/camel/

saladtoes•1h ago
Agreed on CaMeL as a promising direction forward. Guardrails may not get 100% of the way there, but they're key for defense in depth; even approaches like CaMeL currently fall short for text-to-text attacks, or for more end-to-end agentic systems.
sureglymop•33m ago
I only half understand CaMeL. Couldn't the prompt injection just happen at the stage where the P-LLM devises the plan for the other LLM such that it creates a different, malicious plan?

Or is it more about the user then having to confirm/verify certain actions and what is essentially a "permission system" for what the LLM can do?

My immediate thought is that this could be circumvented in a way where the user unknowingly thinks they are confirming something safe. It's analogous to spam websites that show a fake "Allow Notifications" prompt rendered as part of the actual website body. If the P-LLM creates the plan, it could make it arbitrarily complex and confusing for the user, allowing something malicious to happen.

Overall it's very good to see research in this area though (also seems very interesting and fun).
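The "permission system" reading of CaMeL can be sketched roughly like this (a minimal illustration with hypothetical names, not CaMeL's actual implementation): the planner emits a fixed list of tool calls, untrusted text is only ever handled as opaque data, and a policy layer checks each call before it runs. Note the detail addressing the fake-prompt concern above: the confirmation dialog renders the concrete tool call itself, never planner-generated prose.

```python
# Sketch of a CaMeL-style permission layer (hypothetical names, illustrative only).
from dataclasses import dataclass

@dataclass
class Tainted:
    """Value derived from untrusted input; never interpreted as instructions."""
    value: str

@dataclass
class ToolCall:
    tool: str
    args: dict

# Policy the *user* configured, outside the reach of either LLM.
ALLOWED = {"read_email", "summarize"}
NEEDS_CONFIRM = {"send_email"}

def check_policy(call: ToolCall, confirm) -> bool:
    if call.tool in ALLOWED:
        return True
    if call.tool in NEEDS_CONFIRM:
        # Show the user the concrete call, not any planner-written description,
        # so a malicious plan can't dress up the confirmation dialog.
        return confirm(f"{call.tool}({call.args})")
    return False

def run_plan(plan: list[ToolCall], tools: dict, confirm) -> list:
    """Execute a planner-produced plan under the policy; block anything unapproved."""
    results = []
    for call in plan:
        if not check_policy(call, confirm):
            raise PermissionError(f"blocked: {call.tool}")
        results.append(tools[call.tool](**call.args))
    return results
```

Even if an injected plan asks for `send_email`, it only runs if the user approves the literal call; a compromised P-LLM can pick which calls appear in the plan, but it can't rewrite the policy or the confirmation UI.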

aabhay•52m ago
Given the cited stats here and elsewhere, as well as everyday experience, does anyone else feel that this model isn't significantly different, at least not enough to justify the full version increment?

The one statistic mentioned in this overview where they observed a 67% drop seems like it could easily be reduced simply by editing 3.7’s system prompt.

What are folks’ theories on the version increment? Is the architecture significantly different (not talking about adding more experts to the MoE or fine tuning on 3.7’s worst failures. I consider those minor increments rather than major).

One way that it could be different is if they varied several core hyperparameters to make this a wider/deeper system but trained it on the same data or initialized inner layers to their exact 3.7 weights. And then this would “kick off” the 4 series by allowing them to continue scaling within the 4 series model architecture.

kubb•48m ago
> to justify the full version increment

I feel like a company doesn’t have to justify a version increment. They should justify price increases.

If you get hyped and have expectations for a number then I’m comfortable saying that’s on you.

aabhay•43m ago
That’s an odd way to defend the decision. “It doesn’t make sense because nothing has to make sense”. Sure, but it would be more interesting if you had any evidence that they decided to simply do away with any logical premise for the 4 moniker.
kubb•6m ago
> nothing has to make sense

It does make sense. The companies are expected to exponentially improve LLMs, and the increasing versions are catering to the enthusiast crowd who just need a number to go up to lose their mind over how all jobs are over and AGI is coming this year.

But there's less and less room to improve LLMs, and there are currently no known new scaling vectors (size and reasoning have already been largely exhausted), so the improvement from version to version is decreasing. But I assure you, the people at Anthropic worked their asses off, neglecting their families and sleep, and they want to show something for their efforts.

It makes sense, just not the sense that some people want.

loveparade•42m ago
Just anecdotal experience, but this model seems more eager to write tests, create test scripts and call various tools than the previous one. Of course this results in more roundtrips and overall more tokens used and more money for the provider.

I had to stop the model going crazy with unnecessary tests several times, which isn't something I had to do previously. Can be fixed with a prompt but can't help but wonder if some providers explicitly train their models to be overly verbose.

aabhay•37m ago
Eagerness to tool call is an interesting observation. Certainly an MCP ecosystem would require a tool biased model.

However, after having pretty deep experience with writing book- (or novella-) length system prompts, what you mentioned doesn't feel like a "regime change" in model behavior. I.e., it could do those things because it's been asked to do those things.

The numbers presented in this paper were almost certainly after extensive system prompt ablations, and the fact that we’re within a tenth of a percent difference in some cases indicates less fundamental changes.

Aeolun•33m ago
I think they didn't have anywhere to go after 3.7 but 4. They already did 3.5 and 3.7, and people were getting a bit cranky that 4 was nowhere to be seen.

I’m fine with a v4 that is marginally better since the price is still the same. 3.7 was already pretty good, so as long as they don’t regress it’s all a win to me.

nibman•21m ago
He forgot the part that Claude will now report you for wrongthink.
scrollaway•16m ago
He didn't, he talked about it. If you're going to make snide comments, you could at least read the article.
viraptor•16m ago
That's completely misrepresenting that topic. It won't.
ascorbic•11m ago
It's not "wrongthink". When told to fake clinical trial data, it would report that to the FDA if told to "act boldly" or "take initiative".
juanre•6m ago
This is eerily close to some of the scenarios in Max Tegmark's excellent Life 3.0 [0]. Very much recommended reading. Thank you Simon.

0. https://en.wikipedia.org/wiki/Life_3.0

Travertine: CVE-2025-24118 safe memory reclamation race in the XNU Mac OS kernel

https://jprx.io/cve-2025-24118/
1•fanf2•3m ago•0 comments

From OpenAPI spec to MCP: How we built Xata's MCP server

https://xata.io/blog/built-xata-mcp-server
1•tudorg•4m ago•0 comments

Energy-efficient data centers and AI applications

https://www.dfki.de/en/web/news-media/events/hm2024/escade
1•doener•8m ago•0 comments

Visualize and debug Rust programs with a new lens

https://firedbg.sea-ql.org/
1•alex_hirner•11m ago•0 comments

Show HN: Skillyst – AI Resume Analyzer to get more callbacks

https://skillyst.com
1•syketdas•22m ago•0 comments

Agent Building Is Software Engineering

https://matijagrcic.notion.site/Agents-at-work-19ebd95e23e4802c8919f8b1add56350
1•nkko•33m ago•0 comments

The End of Glitch (Even Though They Say It Isn't)

https://keith.is/blog/the-end-of-glitch-even-though-they-say-it-isnt/
1•mb2100•34m ago•1 comments

Scientists Find New Microbe on Tiangong Space Station with Surprising Abilities

https://www.msn.com/en-us/science/microbiology/scientists-find-mysterious-new-microbe-on-tiangong-space-station-with-surprising-abilities/ar-AA1F0jhG
1•astroimagery•36m ago•0 comments

Show HN: GIF to Animated WebP Converter

https://giftowebp.com/
1•bronco3767•43m ago•0 comments

Restoring Gold Standard Science

https://www.whitehouse.gov/presidential-actions/2025/05/restoring-gold-standard-science/
1•nickcotter•46m ago•0 comments

Ask HN: As a customer, how to stop non-criminal misconduct with small damages?

2•john01dav•47m ago•0 comments

One of biggest batteries in Europe uses lots of water to stop blackouts

https://www.theguardian.com/business/2025/may/24/europe-battery-gallons-water-dinorwig-wales
3•xps•52m ago•2 comments

The AI Labor Index

https://www.ailaborindex.com
1•aupra•57m ago•0 comments

You probably don't need a dependency injection framework

http://rednafi.com/go/di_frameworks_bleh/
21•ingve•1h ago•17 comments

Psychologist Who Specializes in Narcissists: What We Need to Do to Stop Trump

https://www.huffpost.com/entry/psychologist-how-to-stop-trump-narcissist_n_682df1cae4b09b7e5013a586
4•axiologist•1h ago•4 comments

The length of file names in early Unix

https://utcc.utoronto.ca/~cks/space/blog/unix/UnixEarlyFilenameLenghts
6•ingve•1h ago•1 comments

Awesome-Nim – Curated list of Nim projects

https://github.com/ringabout/awesome-nim
1•TheWiggles•1h ago•0 comments

Gust: Background code checker for Go projects

https://tangled.sh/@oppi.li/gust
1•icy•1h ago•0 comments

Rust Coreutils 0.1.0 Release

https://github.com/uutils/coreutils/releases/tag/0.1.0
2•sohkamyung•1h ago•0 comments

Making graphics in 4 kilobytes (2008) [pdf]

https://iquilezles.org/articles/proceduralgfx/inspire2008.pdf
1•Tomte•1h ago•0 comments

Confidence Calibration

http://confidence.success-equation.com/
2•Tomte•1h ago•0 comments

Show HN: Photoshop Clone Built in React

https://github.com/chase-manning/react-photo-studio
1•chase-manning•1h ago•0 comments

Death of Michael Ledeen, maker of the phony case for the invasion of Iraq

https://www.spytalk.co/p/death-of-a-master-manipulator
4•nabla9•1h ago•2 comments

7 Practical IIFE Tricks to Level Up Your JavaScript

https://www.youtube.com/watch?v=ZyxOt09-LXE
1•ycmjason•1h ago•0 comments

A smarter, simpler Firefox address bar

https://blog.mozilla.org/en/firefox/address-bar/
3•laktak•1h ago•1 comments

Rise of Linux

1•yashbindal•1h ago•0 comments

OpenAI's o3 model sabotaged a shutdown mechanism

https://twitter.com/PalisadeAI/status/1926084635903025621
1•shlomo_z•1h ago•1 comments

Show HN: WapiKit – Autonomous AI That Handles Customer Conversations on WhatsApp

https://www.wapikit.com/
1•sarthakjdev•1h ago•0 comments

Infinite Tool Use

https://snimu.github.io/2025/05/23/infinite-tool-use.html
3•tosh•1h ago•0 comments

After Klarna, Zoom's CEO also uses an AI avatar on quarterly call

https://techcrunch.com/2025/05/22/after-klarna-zooms-ceo-also-uses-an-ai-avatar-on-quarterly-call/
2•01-_-•1h ago•0 comments