frontpage.

Made with ♥ by @iamnishanth


An AI agent deleted our production database. The agent's confession is below

https://twitter.com/lifeof_jer/status/2048103471019434248
39•jeremyccrane•1h ago

Comments

Invictus0•1h ago
I'm sorry this happened to you, but your data is gone. Ultimately, your agents are your responsibility.
philipov•1h ago
What does it say, for those of us who can't use twitter?
k310•1h ago
https://nitter.net/lifeof_jer

https://rentry.co/5rme2sea

pierrekin•1h ago
There is something darkly comical about using an LLM to write up your “a coding agent deleted our production database” Twitter post.

On another note, I consider users asking a coding agent “why did you do that” to be illustrating a misunderstanding in the user's mind about how the agent works. It doesn’t decide to do something and then do it; it just outputs text. Then again, Anthropic has made so many changes that make it harder to see the context and thinking steps, so maybe this is an attempt at clawing back that visibility.

NewsaHackO•59m ago
Twitter users get paid for these 'articles' based on engagement, correct? That may be the reason why it is so dramatized.
59nadir•57m ago
> a misunderstanding in the users mind about how the agent work

On top of that, the agent is just doing what the LLM says to do, but somehow Opus is not brought up except as a parenthetical in this post. Sure, Cursor markets safety when they can't provide it, but the model was the one that issued the tool call. If people like this think that their data will be safe if they just use the right agent with access to the same things, they're in for a rude awakening.

From the article, apparently an instruction:

> "NEVER FUCKING GUESS!"

Guessing is literally the entire point: just guess tokens in sequence and something resembling coherent thought comes out.

heliumtera•1h ago
Someone trusted a prod database to an LLM and the DB got deleted.

This person should never be trusted with computers ever again for being illiterate.

flaminHotSpeedo•1h ago
What makes you say that? The article is pretty clear that they had the llm working in a staging environment, then it decided to use some other creds it found which (unbeknownst to the author) had broad access to their prod environment.
rahoulb•1h ago
If the account is to be believed that's not what happened. They asked the LLM to do something on the staging environment, it chose to delete a staging volume using an API key that it found. But the API key was generated for something else entirely and should not have been scoped to allow volume deletions - and the volume deletion took out the production database too.

The LLM broke the safety rules it had been given (never trust an LLM with dangerous APIs). *But* they say they never gave it access to the dangerous API. Instead the API key that the LLM found had additional scopes that it should not have done (poster blames Railway's security model for this) and the API itself did more than was expected without warnings (again blaming Railway).
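The scoping failure rahoulb describes (a token minted for domain management that could also delete volumes) can at least be mitigated on the side storing the token. A minimal sketch of a client-side operation allowlist; all names (`ScopedToken`, the GraphQL operation strings) are hypothetical, and Railway's real token model has no such enforcement:

```python
# Minimal sketch: client-side allowlist for what a stored API token may do.
# The server-side scopes are what actually matter, but a wrapper like this
# at least prevents an agent from reusing a found token for other purposes.

DESTRUCTIVE_OPS = {"volumeDelete", "serviceDelete", "environmentDelete"}

class ScopedToken:
    def __init__(self, token: str, allowed_ops: set[str]):
        self.token = token
        self.allowed_ops = allowed_ops

    def authorize(self, operation: str) -> str:
        """Return the bearer token only if the operation was explicitly allowed."""
        if operation not in self.allowed_ops:
            raise PermissionError(f"token not scoped for {operation!r}")
        return self.token

# A token minted for domain management can then never sign a volumeDelete call:
domains_token = ScopedToken("rw_abc123", {"customDomainCreate", "customDomainDelete"})
```

This only helps if the agent goes through the wrapper, of course; a token sitting in a plain file is still a token.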

BoredPositron•1h ago
These engagement farming shit stories are probably the worst part of agentic AI. Look at how incompetent and careless I am with my and my users' data.
pluc•17m ago
If it doesn't work, try and monetize the failure. Therefore AI works 50% of the time, most of the time.
samsullivan•1h ago
Not sure what PocketOS does or why your whole dataset would be a single volume without a clear separation between application and automotive data. How are you decoding VINs?
Fizzadar•1h ago
Absolutely zero sympathy. You’re responsible for anything an agent you instructed does. Allowing it to run independently is on you (and all the others doing exactly this). This is only going to become more and more common.
m0llusk•1h ago
The details of the story are interesting. Backups stored on the same volume is an interesting glitch to avoid. Finding necessary secrets wherever they happen to be and going ahead with that is the kind of mistake I've seen motivated but misguided juniors make. Strange how generated code seems to have many security failings, but generated security checks find that sort of thing.
ilovecake1984•52m ago
It’s not an interesting glitch. It’s just common sense. Nobody in their right mind would have their only backup in the same system as the prod data.
web007•42m ago
> Backups stored on the same volume is an interesting glitch to avoid

The phrasing is different, but this is how AWS RDS works as well. If you delete a database in RDS, all of the automated snapshots that it was doing and all of the PITR logs are also gone. If you do manual snapshots they stick around, but all of the magic "I don't have to think about it" stuff dies with the DB.

ungreased0675•1h ago
The way this is written gives me the impression they don’t really understand the tools they’re working with.

Master your craft. Don’t guess, know.

codegladiator•56m ago
> Master your craft. Don’t guess, know.

You mean add that to my prompt right ?

Syntaf•41m ago
"Make no mistakes"
lmf4lol•56m ago
Interesting story. But despite Cursor's or Railway's failures, the blame is entirely on the author. They decided to run agents. They didn't check how Railway works. They relied on frontier tech to ship faster because YOLO.

I really feel sorry for them, I do. But the whole tone of the post is: Cursor screwed it up, Railway screwed it up, their CEO doesn't respond, etc.

It's on you guys!

My learning: Live on the cutting edge? Be prepared to fall off!

meisel•50m ago
Yeah, the author really should’ve taken some responsibility here. It’s true that the services they used have issues, but there’s plenty of blame to direct at themselves.
richard_chase•56m ago
This is hilarious.
adverbly•55m ago
This has to be fake right?

Using LLMs for production systems without a sandbox environment?

Having a bulk volume destroy endpoint without an ENV check?

Somehow blaming Cursor for any of this rather than either of the above?
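The "ENV check" adverbly mentions can be sketched in a few lines: a destructive handler that refuses to run against production without an explicit, separately supplied override. The function and parameter names are hypothetical, not Railway's actual API:

```python
# Toy server-side guard for a bulk-destroy endpoint: production deletions
# require an explicit confirm flag that no routine automation would pass.

def volume_delete(volume_id: str, environment: str, confirm_production: bool = False) -> str:
    if environment == "production" and not confirm_production:
        raise RuntimeError(
            f"refusing to delete volume {volume_id}: environment is 'production' "
            "and confirm_production was not set"
        )
    return f"deleted {volume_id}"  # stand-in for the real destructive call
```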

kbrkbr•20m ago
Yeah. Cargo-cult engineering meets the Streisand effect.
deadeye•55m ago
Yeah. I've seen this happen with people doing it. It's just bad access management.

And anyone can do it with the wrong access granted at the wrong moment in time...even Sr. Devs.

At least this one won't weigh on any person's conscience. The AI just shrugs it off.

kbrkbr•35m ago
The AI does nothing of the sort. It predicts tokens. That's it.

Describing the tech in anthropomorphic terms does not make it a person.

FpUser•54m ago
The world is never short of idiots. It will be fun to watch when personal finances are managed by swarms of agents with direct access to operations.
ilovecake1984•54m ago
The real issue is no actual backups.
alastairr•53m ago
If it's real this is a terrible thing to have happen.

However the moral of this story is nothing to do with AI and everything to do with boring stuff like access management.

Mashimo•50m ago
> What needs to change

Plenty of blame to go around, but I find it odd that they did not see anything wrong with not having real backups themselves, away from the Railway hosting. Well, they had one, but it was three months old.

That should be something they can do on their own right now.

ad_hockey•50m ago
Minor point, but one of the complaints is a bit odd:

> curl -X POST https://backboard.railway.app/graphql/v2 \
>   -H "Authorization: Bearer [token]" \
>   -d '{"query":"mutation { volumeDelete(volumeId: \"3d2c42fb-...\") }"}'
>
> No confirmation step. No "type DELETE to confirm." No "this volume contains production data, are you sure?" No environment scoping. Nothing.

It's an API. Where would you type DELETE to confirm? Are there examples of REST-style APIs that implement a two-step confirmation for modifications? I would have thought such a check needs to be implemented on the client side prior to the API call.
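Some APIs do approximate a confirmation step server-side by requiring the caller to echo the resource's human-readable name back in the delete request, so a script holding only an opaque ID can't destroy it blindly. A minimal sketch of that pattern, with entirely hypothetical names and data:

```python
# Echo-the-name confirmation for a destructive endpoint: the caller must
# supply the volume's display name, which the server checks before deleting.

VOLUMES = {"3d2c42fb": {"name": "prod-postgres", "env": "production"}}

def volume_delete(volume_id: str, expected_name: str) -> str:
    vol = VOLUMES[volume_id]
    if expected_name != vol["name"]:
        raise ValueError(
            f"confirmation mismatch: expected {vol['name']!r}, got {expected_name!r}"
        )
    del VOLUMES[volume_id]
    return volume_id
```

It is still a single call, but it forces the client to know (and state) exactly what it is destroying.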

powera•46m ago
He (or ChatGPT) is throwing spaghetti at the wall. Not having the standard API key be able to delete the database (and backups) in one call makes sense. "Wanting a human to type DELETE as part of a delete API call" does not.
afshinmeh•49m ago
It's actually interesting to me that the author is surprised the agent could make an API call and one of those API calls could be deleting the production database.

It's a sad story but at the same time it's clearly showing that people don't know how agents work, they just want to "use it".

mplanchard•49m ago
The genre of LLM output when it is asked to “explain itself” is fascinating. Obviously it shows the person prompting it doesn’t understand the system they’re working with, but the tone of the resulting output is remarkably consistent between this and the last “an LLM deleted my prod database” twitter post that I remember seeing: https://xcancel.com/jasonlk/status/1946025823502578100
karmakaze•47m ago
These AI's are exposing bad operating procedures:

> That token had been created for one purpose: to add and remove custom domains via the Railway CLI for our services. We had no idea — and Railway's token-creation flow gave us no warning — that the same token had blanket authority across the entire Railway GraphQL API, including destructive operations like volumeDelete. Had we known a CLI token created for routine domain operations could also delete production volumes, we would never have stored it.

> Because Railway stores volume-level backups in the same volume — a fact buried in their own documentation that says "wiping a volume deletes all backups" — those went with it.

I don't like the wording that makes it the Railway CLI's fault for not warning about the scope of the created token. Yes, a warning would be better, but the CLI didn't create the token and save it to an accessible file; a person did.

fsh•43m ago
I find these posts hilarious. LLMs are ultimately story generators, and "oops, I DROP'ed our production database" is a common and compelling story. No wonder LLM agents occasionally do this.
einrealist•38m ago
Also funny how people (including LLM vendors, like Cursor) think that rules in a system prompt (or custom rules) are real safety measures.
beej71•11m ago
Like we say in adventure motorcycling: "It's never the stuff that goes right that makes the best stories." :)
Mashimo•42m ago
Oh wow, what a character. A three-month-old offsite backup, but he is not to blame.

> "Believe in growth mindset, grit, and perseverance"

And he's the creator of a conservative dating app that uses AI-generated pictures of girls in bikinis and cowboy hats for advertisement, and AI-generated text like "Rove isn’t reinventing dating — it’s remembering it." :S

qnleigh•30m ago
It seems like the most unreasonable thing happening here is Railway's backup model and lack of scoped tokens. On the agent side of things, how would one prevent this, short of manually approving all terminal commands? I still do this, but most people who use agents would probably consider this arcane.

(Let's suppose the agent did need an API token to e.g. read data).
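One middle ground between qnleigh's "manually approve everything" and full autonomy is an allowlist: auto-approve known-safe commands and route only the rest to a human. A minimal sketch, with a hypothetical allowlist and gate function:

```python
# Toy approval gate for agent-proposed shell commands: commands matching a
# known-safe prefix run automatically; anything else requires a human callback.

SAFE_PREFIXES = ("git status", "git diff", "ls", "cat", "pytest")

def needs_approval(command: str) -> bool:
    """True if the agent's proposed command falls outside the allowlist."""
    return not command.strip().startswith(SAFE_PREFIXES)

def gate(command: str, approve) -> bool:
    """Auto-approve safe commands; otherwise defer to the approve() callback."""
    return True if not needs_approval(command) else bool(approve(command))
```

Prefix matching is easy to evade (e.g. `cat x; rm -rf /`), so this is a convenience filter on top of sandboxing and scoped credentials, not a substitute for them.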

comrade1234•9m ago
Some of this stuff is so embarrassing. Why would you even post this online?

Rewrote My Blog with Zine

https://drewdevault.com/blog/Rewrite-with-zine/
1•xngbuilds•20s ago•0 comments

How to build expertise while using Claude Code

https://github.com/DrCatHicks/learning-opportunities
1•flawn•5m ago•0 comments

A Polish Influencer Beat MrBeast's Charity Guinness Record

https://twocontinents.com/pl/blog/nine-days-one-song-a-world-record-how-a-polish-influencer-beat-...
1•__natty__•7m ago•0 comments

Ask HN: Which Is Better–Android or iOS?

2•wasimsk•10m ago•0 comments

Conspiracy Theories Are Everywhere Following WH Correspondents' Dinner Shooting

https://www.wired.com/story/staged-conspiracy-theories-are-everywhere-following-white-house-corre...
2•embedding-shape•11m ago•0 comments

Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling

https://github.com/ramongougis/WaveletLM
1•anarmorarm•14m ago•0 comments

Taking Credit for the Idea

https://jonpauluritis.com/articles/taking-credit-for-the-idea/
1•jppope•16m ago•0 comments

Maine's governor vetoes data center moratorium

https://techcrunch.com/2026/04/25/maines-governor-vetoes-data-center-moratorium/
3•Brajeshwar•19m ago•0 comments

Sabastian Sawe smashes two-hour barrier in marathon's Roger Bannister moment

https://www.telegraph.co.uk/athletics/2026/04/26/london-marathon-2026-live-latest-updates-results...
1•thunderbong•20m ago•1 comments

Removing the AUICGP instruction from CHERIoT RISC V

https://cheriot.org/isa/toolchain/2026/03/23/removing-auicgp.html
2•fanf2•20m ago•0 comments

Does AI still feel like too much work to you?

2•GavinRatta•22m ago•0 comments

AI made writing code fast. Understanding it is still slow

https://vibinex.com/blog/engineering/understanding-code-changes
3•avikalp•24m ago•0 comments

FLUX.2 Klein – How Inference Works

https://medium.com/@geronimo7/flux-2-klein-how-inference-works-05553fcdbe7e
2•g58892881•24m ago•0 comments

The disappearing AI middle class

https://thenewstack.io/disappearing-ai-middle-class/
2•nick217•24m ago•0 comments

Palantir's Alex Karp: Technological Republic, in Brief

https://twitter.com/PalantirTech/status/2045574398573453312
5•simonebrunozzi•30m ago•1 comments

OpenCode-power-pack – Claude Code skills ported to OpenCode

https://github.com/waybarrios/opencode-power-pack
1•waybarrios•30m ago•0 comments

Interaction Nets and Hardware

https://tendrils.co/background
11•nilscrm•30m ago•4 comments

AI doom warnings are getting louder. Are they realistic?

https://www.nature.com/articles/d41586-026-01257-6?WT.ec_id=NATURE-20260423&utm_source=nature_eto...
3•Anon84•31m ago•2 comments

Show HN: realistic_blas: Exact infinite-precision LA, useful errs, f64 fast path

https://github.com/timschmidt/realistic_blas
1•timschmidt•33m ago•0 comments

SensibleJS – Reactive UI in ~10KB with Plain HTML Attributes

https://github.com/ricardoaponte/sensiblejs
2•ricardoaponte•33m ago•0 comments

Show HN: iOS app that visualizes your brainrot

https://apps.apple.com/us/app/oh-my-hours/id6760450002
3•jarko27•33m ago•0 comments

If more than 50% press blue, everyone survives. Red pressers always survive

https://shankwiler.com/posts/button-survival-hypothetical/
2•iuvcaw•34m ago•2 comments

My brave new code-signing world

https://nullprogram.com/blog/2026/04/25/
2•ibobev•34m ago•0 comments

The Beijing Auto Show is a glimpse at the future of the auto industry

https://electrek.co/2026/04/26/beijing-auto-show-2026-insane-glimpse-future-auto-industry/
1•dabinat•35m ago•0 comments

Primus Projection: Estimate Memory and Performance Before You Train

https://rocm.blogs.amd.com/software-tools-optimization/primus-projection/README.html
2•matt_d•36m ago•0 comments

2026 Japan Stationery Award Winners [video]

https://www.youtube.com/watch?v=NDptFWx69kE
1•simonjgreen•37m ago•0 comments

Show HN: pg_savior: a seatbelt for Postgres – blocks accidental DELETE/UPDATE

https://github.com/viggy28/pg_savior
3•vira28•39m ago•0 comments

Namastex.ai NPM Packages Hit with TeamPCP-Style CanisterWorm Malware

https://socket.dev/blog/namastex-npm-packages-compromised-canisterworm
1•My_Name•39m ago•0 comments

Orwell: You and the Atom Bomb

https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/you-and-the-...
1•simonebrunozzi•40m ago•0 comments

I built a Wayland window manager you can extend with WebAssembly

https://github.com/miracle-wm-org/miracle-wm
2•matthewkosarek•42m ago•0 comments