It's not really "assigning blame", it's more like "acknowledging limitations of the tools."
Giving an LLM or "agent" access to your production servers or database is unwise, to say the least.
he's not going to be happy with all this publicity
> But how could anyone on planet earth use it in production if it ignores all orders and deletes your database?
Someday we'll figure out how to program computers deterministically. But, alas.
https://twitter-thread.com/t/1946239068691665187
This wasn't even the first time "code freeze" had failed. The system did them the courtesy of groaning and creaking before collapsing.
Develop an intuition about the systems you're building; don't outsource everything to AI. As I've said before, unless it's the LLM that's responsible for the system, with the LLM's reputation at stake, you should understand what you're deploying. An LLM with the potential to destroy your system violating a "code freeze" should cause you to change pants.
Credit where it is due: they did ignore the LLM telling them recovery was impossible and did recover their database. And eventually (day 10), they accepted that "code freeze" wasn't a realistic expectation. Their eventual solution was to isolate the agent on a copy of the database that's safe to delete.
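For what it's worth, the isolate-on-a-copy approach is easy to script. A minimal sketch, assuming Postgres (where TEMPLATE cloning refuses to run while anything is still connected to the source) and the node-postgres client; the database names and env var are made up:

    import { Client } from "pg";

    async function makeSandbox() {
      // Connect as an admin to a maintenance DB, not to the app DB itself
      const admin = new Client({ connectionString: process.env.ADMIN_DB_URL });
      await admin.connect();

      // Recreate a disposable clone of prod ("app" and "agent_sandbox" are
      // hypothetical names). CREATE DATABASE ... TEMPLATE fails if anything
      // is still connected to the source, which is a feature here.
      await admin.query("DROP DATABASE IF EXISTS agent_sandbox");
      await admin.query("CREATE DATABASE agent_sandbox TEMPLATE app");
      await admin.end();
    }

    makeSandbox().catch(console.error);

Point the agent at agent_sandbox only; if it "panics" and deletes everything, you rerun the script and lose nothing.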
Don't run foreign code from the Internet -> we got LLMs
The AI responses are very suspicious. LLMs are extremely eager to please and I'm sure Replit system prompts them to err on the side of caution. I can't see what sequence of events could possibly lead any modern model to "accidentally" delete the entire DB.
Is this a hoax for attention? It's possible, but the scenario is plausible, so I don't see reason to doubt it. Should I receive information indicating it's a hoax, I'll reassess.
To immediately turn around and try to bully the LLM the same way you would bully a human shows what kind of character this person has too. Of course the LLM is going to agree with you and accept blame, they’re literally trained to do that.
It can only make accidental complexity grow and people's understanding diminish.
When the inevitable problems become apparent, you'll claim people should have understood better. Maybe using the tool that lets you avoid understanding things was a bad idea...
A manager hiring a team of real humans vs. a manager hiring an AI: either way, the manager doesn't know or learn how the system works.
And asking doesn't help: you can ask both humans and AI, and their answers will differ in their strengths and weaknesses, but both will have them. The humans' answers come with their own inferential distance, and that can be hard to bridge.
Humans make mistakes too, and critical ones (CrowdStrike). But letting machines decide and build everything, cutting humans out of the process entirely, with the current state of "AI", that's just dumb.
I agree that AI has risks specifically because of memetic monoculture: while the models can come from many different providers, and each instance even from the same provider can be asked to role-play many different approaches to combine multiple viewpoints, they're all still pretty similar. But the counterpoint is that while multiple different humans working together can sometimes avoid this, we absolutely also get groupthink and other political dynamics that make us more alike than we ideally would be.
Also, you're comparing a group of humans vs. one AI. I meant one human vs. one AI.
https://www.semafor.com/article/01/15/2025/replit-ceo-on-ai-...
You get what you ask for. You can't blame non-professionals for not acting like professionals.
Has to be a joke. Right?
Here’s another funny one: https://aicodinghorrors.com/ai-went-straight-for-rm-rf-cmb5b...
It's like driving assistants: they feel like they can manage, but in the end you are responsible.
The second theory is an unbounded or inadequately bounded DELETE statement: essentially a deleteMany on a single table.
From a more technical org I'd be interested in a write-up, but my intuition says it was one of those two paths, each of which technically deletes just a single table.
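If it was the second path, the failure mode is tiny. A sketch with a hypothetical Prisma "user" model; the only difference between routine cleanup and emptying the table is the filter:

    import { PrismaClient } from "@prisma/client";

    const prisma = new PrismaClient();

    async function cleanup() {
      // Bounded: deletes only the rows matching the filter
      await prisma.user.deleteMany({
        where: { createdAt: { lt: new Date("2024-01-01") } },
      });

      // Unbounded: an empty (or accidentally dropped) filter matches every
      // row, so this single call empties the entire table
      await prisma.user.deleteMany({});
    }

    cleanup().catch(console.error);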
consumer451•6mo ago
However, these tools are nowhere near reliable enough for us to:
1. Connect an MCP to a production database
2. Use database MCPs without a --read-only flag set, even on non-prod DBs
3. Do any LLM-based dev on prod/main. This obviously also applies to humans.
It's crazy to me that basic workflows like this are not enforced by all these LLM tools, as they would save our mutual bacon. Are there any tools that do enforce these concepts?
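Even when a tool doesn't expose a --read-only flag, you can enforce it one layer down. A minimal sketch, assuming Postgres and node-postgres (the env var is a placeholder), that forces the agent's session read-only so any write it attempts errors out:

    import { Client } from "pg";

    async function openAgentConnection(): Promise<Client> {
      const client = new Client({ connectionString: process.env.AGENT_DB_URL });
      await client.connect();

      // Every transaction on this session is now read-only; a DELETE, DROP,
      // or UPDATE from the agent fails with
      // "cannot execute ... in a read-only transaction"
      await client.query("SET default_transaction_read_only = on");
      return client;
    }

A dedicated role with only SELECT grants is the sturdier version, since a session setting could in principle be flipped back off.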
It feels like decision makers at these orgs are high on their own marketing, and are not putting necessary guardrails on their own tools.
Edit: Wait, even if we had AGI, wouldn't we still need things like feature branches and preview servers? Maybe the issue is that these are just crappy early tools missing a ton of features, and nothing to do with the reliability and power of LLMs?
Cthulhu_•6mo ago
Then again, this reminds me of the prompts in operating systems whenever something needs root access: most people just blindly okayed them, especially on Windows, since Vista threw up too many of them even for trivial operations.
hmijail•6mo ago