Anthropic Drops Flagship Safety Pledge

https://time.com/7380854/exclusive-anthropic-drops-flagship-safety-pledge/

85•cwwc•2h ago

Comments

ggsp•2h ago

It was always a matter of time

dhruv3006•2h ago

Anthropic has been facing a lot of flak recently.

esafak•1h ago

It must be due to pressure from the Defense Dept:

The AI startup has refused to remove safeguards that would prevent its technology from being used to target weapons autonomously and conduct U.S. domestic surveillance.

Pentagon officials have argued the government should only be required to comply with U.S. law. During the meeting, Hegseth delivered an ultimatum to Anthropic: get on board or the government would take drastic action, people familiar with the matter said.

https://www.staradvertiser.com/2026/02/24/breaking-news/anth...

instagib•21m ago

They probably have proof in contracts that they agreed to this usage. They won’t alter the deal based on some bad press nor do they want to lose the DoD-DoW as a customer.

mhitza•1h ago

The IPOs this year can't come soon enough https://tomtunguz.com/spacex-openai-anthropic-ipo-2026/

SirensOfTitan•1h ago

What an interesting week to drop the safety pledge.

This is how all of these companies work. They’ll follow some ethical code or register as a PBC until that undermines profits.

These companies are clearly aiming at cheapening the value of white collar labor. Ask yourself: will they steward us into that era ethically? Or will they race to transfer wealth from American workers to their respective shareholders?

SilverElfin•1h ago

This is terrible. It’s caving in to the Trump administration threatening to ban Anthropic from government contracts. It really cements how authoritarian this administration is and how dangerous they can be.

bbatsell•1h ago

This headline unfortunately offers more smoke than light. This article has nothing to do with the current tête-à-tête with the Pentagon. It is discussing one specific change to Anthropic's "Responsible Scaling Policy" that the company publicly released today as version "3.0".

ameliaquining•1h ago

I consider this a bigger deal than the Pentagon thing.

ruszki•41m ago

> This article has nothing to do with the current tête-à-tête with the Pentagon.

The article yes, but we cannot be sure about its topic. We definitely cannot claim that they are unrelated. We don't know. It's possible that the two things have nothing to do with each other. It's also possible that they wanted to prevent worse requests and this was a preventive measure.

tbrownaw•25m ago

This is something they've been working on "in recent months". The Pentagon thing was today.

This cannot have been caused by that, unless they've also invented time travel.

chris_money202•1h ago

First they rushed a model to market without safety checks, and I said nothing. It wasn't my field.

Then they ignored the researchers warning about what it could do, and I said nothing. It sounded like science fiction.

Then they gave it control of things that matter, power grids, hospitals, weapons, and I said nothing. It seemed to be working fine.

Then something went wrong, and no one knew how to stop it, no one had planned for it, and no one was left who had listened to the warnings.

hsbauauvhabzb•1h ago

Plenty of people have said plenty. The problem isn’t the warnings, it’s that people are too stupid and greedy to think about the long term impacts.

ashtonshears•44m ago

The societal ills from our collective tendency to ignore red flags seem to be a human trait.

zer00eyz•42m ago

> Then something went wrong, and no one knew how to stop it,

This is the problem with every AI safety scenario like this. It has a level of detachment from reality that is frankly stark.

If linemen stop showing up to work for a week, the power goes out. The US has shown that people with "high powered" rifles can shut down the grid.

We are far, far away from the sort of world where turning AI off is a problem. There isn't going to be a HAL or Terminator style situation when the world is still "I, Pencil".

A lot of what safety amounts to is politics (national, not internal; an example is "is Taiwan a country?"). And a lot more of it is cultural.

jimmydoe•1h ago

Either be a company in capitalist USA, or keep being your safety queen. You just can’t be both.

The intention behind starting these pledges and the conflict with DOW might be sincere, but I don’t expect it to last long, especially since the company is going public very soon.

crossroadsguy•1h ago

I just want Apple and Linux to offer ASAP:

1. Extremely granular ways to let users control network and disk access for apps (great if resource access can also be changed)

2. Make it easier for apps to work with these as well

3. I would be interested in knowing whether the OS/browser could intercept a query at a layer before the CLI/web even gets it, and whether that could prevent harm beforehand, or at least warn or log it for, say, someone who reviews those queries later

And most importantly: all of these via an excellent GUI with clear demarcations and settings, and well documented (Apple might struggle with documentation, so LLMs might help them there)

My point is — why the hell are we waiting for these companies to be good folks? Why not push them behind a safety layer?

I mean, the CLI asks: can I access this folder? Run this program? Download this? But they can just do that if they want! Make them ask those questions the way apps on phones ask for location, mic, and camera access.

m132•16m ago

Indeed, the world would be a much nicer place if only firewalls and Unix permissions existed...

ChrisArchitect•1h ago

Hegseth gives Anthropic until Friday to back down on AI safeguards

https://news.ycombinator.com/item?id=47140734

dbg31415•30m ago

They made it until Tuesday! They stood tall as long as they could! =P

goranmoomin•57m ago

TBH I am sad that Anthropic is changing its stance, but in the current world, if you even care about LLM safety, I feel that this is the right choice: there are too many model providers, and they probably don’t treat safety as being as high a priority as Anthropic does. (Yes, that might change, they can get pressured by the govt, yada yada, but they literally created their own company because of AI safety, so I do think they actually care for now.)

If we need safety, we need Anthropic to be not too far behind (at least for now, before Anthropic possibly becomes evil), and that might mean releasing models that are safer and more steerable than others (even if, unfortunately, they are not 100% up to Anthropic’s goals)

Dogmatism, while great, has its time and place, and with a thousand bad actors in the LLM space, pragmatism wins better.

ashtonshears•41m ago

Do you work at Anthropic, or know people who do?

I'm genuinely curious why they are so holy to you, when I see just another tech company trying to make cash.

Edit: Reading some of the linked articles, I can see that Anthropic's CEO is refusing to allow their product to be used for warfare (killing humans), which is probably a good thing and a reason to support them.

Art9681•53m ago

Of course the US is going to do this, and of course it's in Anthropic's best interest to comply. Right now China is flooding HuggingFace with models that will inevitably have this capability. Right now there are hundreds of models being hosted that have been deliberately processed to remove refusals and their safety training. Everyone who keeps up with this knows about it. HF knows about it. And it is pretty obvious that those open weight models will be deployed in intelligence and defense. It is certain that not just China, but many nations around the world with the capital to host a few powerful servers to run the top open weight models are going to use them for that capability.

The narrative on social media, this site included, is to portray the closed western labs as the bad guys and the less capable labs releasing their distilled open weight models to the world as the good guys.

Right now a kid can go download an Abliterated version of a capable open weight model and they can go wild with it.

But let's worry about what the US DoD is doing or what the western AI companies absolutely dominating the market are doing because that's what drives engagement and clicks.

heftykoo•51m ago

Ah, the classic AI startup lifecycle:

We must build a moat to save humanity from AI.

Please regulate our open-source competitors for safety.

Actually, safety doesn't scale well for our Q3 revenue targets.

dbg31415•31m ago

So they'll cave for Trump.

Will they cave for China, Israel, Myanmar, North Korea... etc?

"It's OK, guys! It's legal to set the ICE kill bots to shoot anyone who isn't white enough."

Disgusting.

tbrownaw•18m ago

> committed to never train an AI system unless it could guarantee in advance that the company’s safety measures were adequate

That doesn't even make sense.

What stops one model from spouting wrongthink and suicide HOWTOs might not work for a different model, and fine-tuning things away uses the base model as a starting point.

You don't know the thing's failure modes until you've characterized it, and for LLMs the way you do that is by first training it and then exercising it.

brikym•15m ago

Don't be evil.

rvz•11m ago

Unsurprising.

