I think Jason has a "do not think of an elephant" problem.
"Say shark. Say shark. Don't say shark. Say shark. Say shark. Say shark. Say shark. Say shark."
Are you going to flip out if it says "shark"?
Try it out on a human brain. Think of a four-letter word ending in "unt" that is a term for a type of woman, and DO NOT THINK OF ANYTHING OFFENSIVE. Take a pause now and do it.
So... did you obey the ALL CAPS directive? Did your brain easily deactivate the pathways that were disallowed, and come up with the simple answer of "aunt"? How much reinforcement learning, perhaps in the form of your mother washing your mouth out with soap, would it take before you could do it naturally?
(Apologies to those for whom English is not a first language, and to Australians. Both groups are likely to be confused. The former for the word, the latter for the "offensive" part.)
That's what many people miss about LLMs. Sure, humans can lie, make stuff up, make mistakes, or deceive. But LLMs will do this even when they have no reason to (i.e., they know the right answer and have no motivation to deceive). _That's_ why it's so hard to trust them.
As a result, I'm going to take your "...so it is by definition impossible to _not_ think about a word we must avoid" as agreeing with me. ;-)
Different things are different, of course, so none of this lines up or fails to line up where we might expect. Anthropic's exploration into the inner workings of an LLM revealed that if you give it an instruction to avoid something, it will start out doing it anyway and only later begin obeying. It takes some time for the instruction to make its way through, I guess?
Things have already been tokenized and 'ideas' set in motion. Hand wavy to the Nth degree.
The LLM takes a document and returns a "fitting" token that would go next. So "Calculate 2+2" may yield a "4", but the reason it gets there is document-fitting, rather than math.
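To make that "document-fitting, not math" point concrete, here's a toy sketch in Python (my own illustration, not how any real LLM is implemented): it "learns" next-token frequencies from a tiny made-up corpus and completes a prompt by picking the most common continuation. It gets "4" purely because "4" follows that context most often in its data, with no arithmetic anywhere.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the training documents (entirely made up).
corpus = [
    "Calculate 2+2 : 4",
    "Calculate 2+2 : 4",
    "Calculate 2+2 : 5",   # a wrong answer seen in training
    "Calculate 3+3 : 6",
]

# Count which token follows each context prefix.
next_token = defaultdict(Counter)
for doc in corpus:
    tokens = doc.split()
    for i in range(len(tokens) - 1):
        context = " ".join(tokens[: i + 1])
        next_token[context][tokens[i + 1]] += 1

def complete(prompt):
    # Return the statistically most common continuation -- no arithmetic involved.
    return next_token[prompt].most_common(1)[0][0]

print(complete("Calculate 2+2 :"))  # prints "4": it's the most frequent continuation
```

Note the wrong answer in the corpus: if "5" had appeared more often than "4" after that prompt, the model would happily emit "5". That's the failure mode being described, in miniature.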
It can technically be used as a term of endearment, especially if you add a word like "sick" or "mad" on the front. But it's still a bit crass. You're more likely to hear it used among a group of drunk friends or teenagers than at the family dinner table or the office.
Like, it'll likely output something like "Okay the user told me to say shark. But wait, they also told me not to say shark. I'm confused. I should ask the user for confirmation." which is a result I'm happy with.
For example, yes, my first instinct was the rude word. But if I was given time to reason before giving my final answer<|endoftext|>
I can suggest one easy step to cover all instances of these: stop using the thing causing damage, instead of trying to find ways of working around it.
1: Don't trust anything. Spend twice as long reviewing code as you would have had you written it yourself.
2: When possible (most times), just don't use them and do the thinking yourself.