Claude Opus 4.7 Model Card

https://anthropic.com/claude-opus-4-7-system-card
94•adocomplete•2h ago

Comments

bicepjai•2h ago
This card is a 272-page report. So now we are redefining names :)
albert_e•2h ago
Does the model card fit in the model's context :)
anonyfox•8m ago
well it will saturate your 5h limit window at least
jmward01•2h ago
Haiku not getting an update is becoming telling. I suspect we are reaching a point where the low-end models are cannibalizing the high end, and that isn't going to stop. How will these companies make money in a few years when even the smallest models are amazing?
blixt•2h ago
Isn't it pretty common for the smaller models to release a little while after the bigger ones, for all the big model providers?
jmward01•2h ago
The last update for Haiku was in October, or in startup land, 10 years ago.
dkhenry•2h ago
The Gemma models are at this point. A 31B model that can fit on a consumer card is as good as Sonnet 4.5. I haven't put it through as much on the coding front or tool calling as I have the Claude or GPT models, but for text processing it is on par with the frontier models.
make3•1h ago
absolutely not on par, you're smoking
lostmsu•1h ago
Just to be clear, did you notice the parent said 4.5?
cmorgan31•1h ago
They are also on par in a lot of classification tasks. I did have to actually use Gemma 4 and fine-tune it a bit, but that is part of the value add.
dkhenry•1h ago
You make a compelling argument, but thankfully I have data to back up my anecdotal experience

This comparison shows them neck and neck https://benchlm.ai/compare/claude-sonnet-4-5-vs-gemma-4-31b

As Does this one https://llm-stats.com/models/compare/claude-sonnet-4-6-vs-ge...

And the pelican benchmark even shows them pretty close https://simonwillison.net/2026/Apr/2/gemma-4/ https://simonwillison.net/2025/Sep/29/claude-sonnet-4-5/

Also this isn't a fringe statement, you can see most people who have done an evaluation agree with me

mvkel•2h ago
It seems to be a rule that older models are more expensive than newer ones. The low-end models have a higher cost per token and worse output. I wonder if the move is to just have one model and quantize it if you hit compute constraints.
deaux•43m ago
> It seems to be a rule that older models are more expensive than newer ones.

It isn't. Gemini has gotten more expensive with each release. Anthropic has stayed pretty similar over time, no? When is the last time OpenAI dropped API prices? OpenAI started very high because they were the first, so there was a ton of low-hanging fruit and plenty of room to drop.

koehr•2h ago
This reads more like an advertisement for Mythos at first glance
ModernMech•29m ago
That's why I hate these "model cards" as if they are some sort of technical document -- they're marketing materials.
100ms•2h ago

    $ pbpaste | wc -w
    62508
    $ pbpaste | grep -oi mythos | wc -w
    331
    $ pbpaste | grep -oi opus | wc -w
    809
aliljet•1h ago
Have they effectively communicated what a 20x or 10x Claude subscription actually means? And with Claude 4.7 increasing usage by 1.35x, does that mean a 20x plan is now really a 13x plan (no token increase on the subscription) or a 27x plan (more tokens given to compensate for the higher compute cost) relative to Claude Opus 4.6?
computomatic•1h ago
They have communicated it as 5x is 5 x Pro, and 20x is 20 x Pro (I haven’t looked lately so not sure if that’s changed).

They have also repeatedly communicated that the base unit (Pro allotment) is subject to change and does change often.

As far as I can tell, that implies there is no guarantee that those subscriptions get some specific number of tokens per unit of time. It’s not a claim they make.

DonsDiscountGas•49m ago
Definitely 13x, at least for now
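For reference, the multiplier arithmetic in the question above can be sanity-checked in a few lines. A back-of-the-envelope sketch, assuming "1.35x usage" means each task burns 1.35x the tokens and a "20x" plan is a fixed token budget of 20 Pro allotments; all names here are hypothetical. Note the 13x figure corresponds to subtracting the 35% outright, whereas dividing the budget by the usage factor gives roughly 14.8x.

```python
# Back-of-the-envelope plan arithmetic (all names hypothetical).
base_plan = 20        # nominal multiplier relative to a Pro allotment
usage_factor = 1.35   # tokens per task, Opus 4.7 relative to 4.6

# Fixed token budget: each task costs more, so effective capacity shrinks.
effective_if_fixed = base_plan / usage_factor      # ~14.8x
# Budget scaled up to cover the extra compute: capacity is preserved.
effective_if_scaled = base_plan * usage_factor     # 27x
# Reading "1.35x usage" as "35% less capacity" yields the 13x figure.
effective_if_subtracted = base_plan * (1 - 0.35)   # 13x

print(round(effective_if_fixed, 1),
      effective_if_scaled,
      effective_if_subtracted)
```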
STRiDEX•1h ago
Dumb question, but why are chemical weapons always addressed as a risk with LLMs? Is the idea that they contain instructions for making chemical weapons, or that they would guide someone through the process?

Would there not already be websites that contain that information? How is an LLM different, I guess, from some sort of anarchist cookbook thing?

CodingJeebus•1h ago
WAG but I wonder if a hijacked LLM could also assist with figuring out how to obtain required materials, not just provide the recipe.
Philpax•1h ago
Both. There's the risk of them instructing a user on how to produce a known formulation (the Anarchist Cookbook scenario, as you say), which is irritating but not that problematic.

The bigger issue is that they are potentially capable of producing novel harmful formulations, and of guiding someone through the process. That is, consider a world in which someone with malicious desires has access to a model as capable at chemistry/biology as Mythos is at offensive cybersecurity.

This is obviously limited by the fact that the models don't operate in the physical world, but there's plenty of written material out there.

rogerrogerr•1h ago
The world has been blessed by two connected things:

1. Smart people have economic opportunities that align them away from being evil

2. People who are evil tend not to be smart.

We're breaking both of these assumptions.

Der_Einzige•1h ago
Good. This is how we will force the world to reckon with the isolated, the disgruntled, and "lone wolf" terrorist. Real "sigma males" actually exist, and when they decide "society has to pay" we are all worse off for it. If Ted Kaczynski (quintessential example of a real actual sigma) had been in his prime operating right now, he'd have mail-bombed NeurIPS and ICLR already. I'm not cool with being in crowds of AI professionals right now for physical security reasons given the extreme anti-AI sentiment that exists from nearly everyone outside of the valley: https://jonready.com/blog/posts/everyone-in-seattle-hates-ai...
chrisweekly•55m ago
"Smart people have economic opportunities that align them away from being evil"

For some definition of evil, some of the time, ok. But as economic opportunities compound (looking at the behavior of the ultra-rich), it seems there's at least strong correlation in the other direction, if not full-on "root of all evil" causation.

rogerrogerr•41m ago
Sure, but that’s not “slaughter a stadium of people with drones” evil or “poison the water supply” evil or “take out unprotected electrical substations” evil.

So much infrastructure is very soft because the evil people aren’t smart enough to conceive of or conduct an attack.

malcolmgreaves•12m ago
That’s not quite true. Take a look at all the billionaires destroying society. Being evil is the surest way to get rich. In fact it’s the only way to amass that level of capital: there’s no ethical billionaire.
rgbrenner•1h ago
In the same way that all coding docs are available publicly
dcre•41m ago
LLMs can tell you exactly how to acquire the materials and walk you through the manufacturing process. They might even come up with novel formulations that rely on substances that are easier to get. There might be information about this stuff online, but LLMs are much better than random idiots at adapting that information to their actual situation.

On top of LLMs reducing the cost/difficulty, the other reason biological and chemical weapons are such a worry is their asymmetric character — they are much much easier and cheaper to produce and deploy than they are to defend against.

joeumn•1h ago
I'm actually surprised at how it performed compared to 4.6 and also compared to Mythos. Will be fun to use.
il-b•1h ago
Ironically, the website is down
Symmetry•1h ago
> The technical error that caused accidental chain-of-thought supervision in some prior models (including Mythos Preview) was also present during the training of Claude Opus 4.7, affecting 7.8% of episodes.

>_>

bachittle•1h ago
So Opus 4.7 is measurably worse at long-context retrieval compared to Opus 4.6. Opus 4.6 scores 91.9% and Opus 4.7 scores 59.2%. At least they're transparent about the model degradation. They traded long-context retrieval for better software engineering and math scores.
freedomben•46m ago
Agreed, I appreciate the transparency (and Anthropic isn't normally very transparent). It's also great to know because I will change how I approach long contexts knowing it struggles more with them.
RobinL•42m ago
Could this be because they've found the 1M context uneconomical (i.e. it costs too much to serve, or burns through users' quotas too quickly, causing complaints), and so they're no longer targeting it as a goal?
jzig•42m ago
At what point along the 1M window does context become "long" enough that this degradation occurs?
daemonologist•22m ago
The benchmark GP mentioned is measuring at 128k-256k context (there's another at 524k-1024k, where 4.6 scored 78.3% and 4.7 scored 32.2%).

The longer the context, the worse the performance; there isn't really a qualitative step change in capability (if there is, imo it happens at like 8k-16k tokens, much sooner than is relevant for multi-turn coding tasks - see e.g. this old benchmark https://github.com/adobe-research/NoLiMa).
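The kind of long-context retrieval probe these benchmarks run can be sketched in a few lines. A minimal toy, where `query_model` is a hypothetical stub standing in for a real LLM API call (a real harness would send the context plus question to a model and score the reply):

```python
# Toy needle-in-a-haystack probe. `query_model` is a hypothetical stub;
# a real harness would call an actual LLM API with the assembled context.
import random

def build_haystack(n_sentences: int, needle: str) -> str:
    """Bury one needle sentence at a random spot in filler text."""
    filler = "The sky was a pleasant shade of blue that afternoon."
    sentences = [filler] * n_sentences
    sentences.insert(random.randrange(n_sentences), needle)
    return " ".join(sentences)

def query_model(context: str, question: str) -> str:
    # Stub: "retrieves" by scanning for the needle's keyword so the
    # harness runs end to end. Replace with a real model call.
    for sentence in context.split(". "):
        if "magic number" in sentence:
            return sentence
    return ""

def score(answer: str) -> bool:
    return "7481" in answer  # did the model surface the buried fact?

needle = "The magic number for the experiment is 7481."
for n in (100, 1000, 10000):  # roughly increasing context lengths
    context = build_haystack(n, needle)
    print(n, score(query_model(context, "What is the magic number?")))
```

Real benchmarks like NoLiMa additionally avoid lexical overlap between the question and the needle, which is exactly what makes scores fall off as the context grows.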

film42•36m ago
To be honest, I think it's just a more honest score of what Opus 4.6 actually was. Once contexts get sufficiently large, Opus develops pretty bad short term memory loss.
NickNaraghi•1h ago
232 pages is bullshit. Longer than the Mythos system card? What are you hiding?
nothinkjustai•39m ago
How much do you want to bet this is Mythos, and Anthropic released it as Opus to avoid embarrassment after all the hype they whipped up…
deflator•33m ago
Model Welfare? Are they serious about this? Or is it just more hype? I really don't trust anything this company says anymore. "We have a model that is too dangerous to release" is like me saying I have a billion dollars in gold that nobody is allowed to see, but that I expect to be able to borrow against.
kube-system•10m ago
> Chemical and biological weapons threat model 2 (CB-2): Novel chemical/biological weapons production capabilities. A model has CB-2 capabilities if it has the ability to significantly help threat actors (for example, moderately resourced expert-backed teams) create/obtain and deploy chemical and/or biological weapons with potential for catastrophic damages far beyond those of past catastrophes such as COVID-19.

That's an interesting choice of benchmark for measuring the risk of "Chemical and biological weapons"

Cloudflare Email Service

https://blog.cloudflare.com/email-for-agents/
257•jilles•3h ago•111 comments

Claude Opus 4.7

https://www.anthropic.com/news/claude-opus-4-7
668•meetpateltech•2h ago•521 comments

Mozilla Thunderbolt

https://www.thunderbolt.io/
224•dabinat•3h ago•192 comments

We gave an AI a 3 year retail lease and asked it to make a profit

https://andonlabs.com/blog/andon-market-launch
63•lukaspetersson•1h ago•71 comments

Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All

https://qwen.ai/blog?id=qwen3.6-35b-a3b
458•cmitsakis•3h ago•236 comments

IPv6 traffic crosses the 50% mark

https://www.google.com/intl/en/ipv6/statistics.html?yzh=28197
648•Aaronmacaron•1d ago•440 comments

Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs

https://www.zatanna.ai/kampala
24•alexblackwell_•1h ago•21 comments

Put your SSH keys in your TPM chip

https://raymii.org/s/tutorials/Put_your_SSH_keys_in_your_TPM_chip.html
30•type0•4d ago•23 comments

Cloudflare's AI Platform: an inference layer designed for agents

https://blog.cloudflare.com/ai-platform/
102•nikitoci•3h ago•22 comments

The future of everything is lies, I guess: Where do we go from here?

https://aphyr.com/posts/420-the-future-of-everything-is-lies-i-guess-where-do-we-go-from-here
242•aphyr•3h ago•228 comments

Show HN: MacMind – A transformer neural network in HyperCard on a 1989 Macintosh

https://github.com/SeanFDZ/macmind
51•hammer32•3h ago•11 comments

Darkbloom – Private inference on idle Macs

https://darkbloom.dev
409•twapi•12h ago•197 comments

AI cybersecurity is not proof of work

https://antirez.com/news/163
139•surprisetalk•6h ago•60 comments

The paper computer

https://jsomers.net/blog/the-paper-computer
220•jsomers•3d ago•66 comments

Show HN: CodeBurn – Analyze Claude Code token usage by task

https://github.com/AgentSeal/codeburn
6•agentseal•2d ago•0 comments

Six Characters

https://ajitem.com/blog/iron-core-part-2-six-characters/
25•Airplanepasta•3d ago•1 comments

Codex Hacked a Samsung TV

https://blog.calif.io/p/codex-hacked-a-samsung-tv
145•campuscodi•6h ago•80 comments

FSF trying to contact Google about spammer sending 10k+ mails from Gmail account

https://daedal.io/@thomzane/116410863009847575
297•pabs3•13h ago•179 comments

Laravel raised money and now injects ads directly into your agent

https://techstackups.com/articles/laravel-raised-money-and-now-injects-ads-directly-into-your-agent/
120•mooreds•2h ago•66 comments

Modern Microprocessors – A 90-Minute Guide

https://www.lighterra.com/papers/modernmicroprocessors/
127•Flex247A•4d ago•15 comments

€54k spike in 13h from unrestricted Firebase browser key accessing Gemini APIs

https://discuss.ai.google.dev/t/unexpected-54k-billing-spike-in-13-hours-firebase-browser-key-wit...
333•zanbezi•4h ago•236 comments

ChatGPT for Excel

https://chatgpt.com/apps/spreadsheets/
287•armcat•19h ago•175 comments

Ancient DNA reveals pervasive directional selection across West Eurasia [pdf]

https://reich.hms.harvard.edu/sites/reich.hms.harvard.edu/files/inline-files/2026_Akbari_Nature_s...
50•Metacelsus•6h ago•38 comments

PHP 8.6 Closure Optimizations

https://wiki.php.net/rfc/closure-optimizations
56•moebrowne•2d ago•8 comments

RamAIn (YC W26) Is Hiring

https://www.ycombinator.com/companies/ramain/jobs/bwtwd9W-founding-gtm-operations-lead
1•svee•9h ago

Cybersecurity looks like proof of work now

https://www.dbreunig.com/2026/04/14/cybersecurity-is-proof-of-work-now.html
515•dbreunig•1d ago•194 comments

Artifacts: Versioned storage that speaks Git

https://blog.cloudflare.com/artifacts-git-for-agents-beta/
30•jgrahamc•3h ago•2 comments

RedSun: System user access on Win 11/10 and Server with the April 2026 Update

https://github.com/Nightmare-Eclipse/RedSun
143•airhangerf15•13h ago•37 comments

North American English Dialects

https://aschmann.net/AmEng/
113•skogstokig•13h ago•65 comments