Oh, but maybe allowing it to do remote git operations is a necessary trigger.
Some people are upset at my brave new world characterization, but yeah even as someone deriving value from Claude Code we've jumped the shark on AI in development.
Either the industry will face that reality and recalibrate, or in 20 years we're going to look back on these days as the golden age of software reliability and just accept that software is significantly more broken than it was (we've been priming ourselves for that, after all).
I think we’ve seen a wave of bad actors - either employees of LLM companies, or bots - pushing hard the idea that code quality doesn't matter and that “the models will improve so fast that your code quality degrading doesn’t matter”.
I think the humans pushing that idea may even believe it, but I don’t think they’re usually employed as software engineers at regular non-AI companies; rather, they have some incentive to believe it and to convince others as well.
It's tending more and more towards pushing the user to treat the whole thing as a pure chat-interface magic black box, instead of a rich dashboard that lets you keep precise track of what's going on and gives you affordances to intervene. So it's less a tool and more a magic agent, where the user is not supposed to even think about what the thing is doing. Just trust the process. If you want to know what it did, just ask it. If you want to know whether it deleted all the files, just ask it in the chat. Or don't. Caring about files is old school. Just care about the chat messages it sends you.
You reap what you sow, finance bro.
The model is probabilistic, and sequences like `git reset --hard` are very common in the training data, so they have some probability of appearing in outputs.
Whether such a command is appropriate depends on context that is not fully observable to the system, like whether a repository or a set of changes is disposable. Because of that, the system cannot rely purely on fixed rules and has to infer intent from incomplete information, which is also probabilistic.
With so many layers of probabilities, it seems expected that sometimes commands like this will be produced even if they are not appropriate in that specific situation.
Even a 0.01% failure rate due to context corruption, misinterpretation of intent, or guardrail errors would show up regularly at scale; that is roughly 1 in 10,000 queries.
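To make that concrete, here is a back-of-the-envelope sketch; the failure probability and query volume are assumed illustrative numbers, not measurements:

```python
# Illustrative only: assumed per-query failure probability and volume.
p = 0.0001            # 0.01% chance per query of an inappropriate destructive command
queries = 1_000_000   # hypothetical total queries across all users

expected_failures = p * queries                # linear expectation
prob_at_least_one = 1 - (1 - p) ** queries     # chance it happens at least once

print(f"expected failures: {expected_failures:.0f}")   # -> 100
print(f"P(at least one):   {prob_at_least_one:.6f}")   # -> effectively 1.0
```

A rate that looks negligible per interaction is near-certain to bite someone once the volume is large enough.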
> I guess, what I'm trying to say ... is this even a bug? Sounds like the model is doing exactly what it is designed to do.
False, it goes against the RLHF and other post-training goals.
That's not what I said at all. I never said it will be produced. I said there is some probability of it being produced.
> False, it goes against the RLHF and other post-training goals.
It is correct that frequency in training data alone does not determine outputs, and that post-training (RLHF, policies, etc.) is meant to steer the model away from undesirable behavior.
But those mechanisms do not make such outputs impossible. They just make them less likely. The underlying system is still probabilistic and operating with incomplete context.
I am not sure how you can be so confident that a probabilistic model would never produce `git reset --hard`. There is nothing inherent in how LLMs work that makes that sequence impossible to generate.
I don't know how that refutes what I'm saying.
The behaviour was reproduced multiple times, so it is clearly an observable outcome, not a one-off. It just shows that the probability of `git reset --hard` is > 0 even with RLHF and post-training.
My point is that fixing one bug does not eliminate the class of bugs. Heck, it does not even fix that one bug deterministically. You only reduce its probability like you rightly said.
With git commands, there is no system like Lean that can formally reject invalid output. Really, I think the mathematicians have it easier with LLMs, because a proof is either valid or invalid. It's not so clear-cut with git commands: almost any command can be valid in some narrow context, which makes it much harder to reject undesirable outputs entirely.
Until the underlying probabilities of undesirable output become so negligible that these failures are practically impossible, these kinds of issues will keep surfacing even if you address individual bugs. Will the probabilities become that low someday? Maybe. But we are not there yet. Until then, we should recalibrate our expectations and rely on deterministic safeguards outside the LLM.
You can reduce the risk, but not drive it to zero, and at scale even very small failure rates will surface.
1. if the problem the post describes is common enough, it is a bug and its extent needs to be reduced (as you said)
2. if it is not common and it happens only for this user, it is not a bug and should be mostly ignored
Point is: the system is not inherently broken in a way that makes it unusable.
What if it happens for two users? (Still "not common").
Never had that experience in the whole time using cursor at work, so I had to "take the agent to task" and ask it "WTF, mate? You'd better be able to repro that!" and then circle around the drain for a while getting an AGENTS.md written up. Not really a big deal, as the whole project was only about 1k lines in and it's not like the code I'd hand-written there was "irreplaceable", but it led to some interesting discussion w/ the AI, like "Why should I have to tell you this? Shouldn't your baseline training data presume not to delete files that you didn't author? How do you think this affects my trust, not just of this agent session, but of all agent interactions in the future?"
Overall, these are quite interesting technology times we're living in.
1) claude will stash (despite clear instructions never to do so).
2) claude will use sed to bulk replace (despite clear instructions never to do so). sed replacements make a mess and touch far too many files.
3) claude restores the stash. Finds a lot of conflicts. Nothing runs.
4) claude decides it can't fix the problem and does a reset hard.
I have this right at the top of my CLAUDE.md and it makes things better, but unlike codex, claude doesn't follow it to the letter. However, it has become a lot better now.
NEVER USE sed TO BULK REPLACE.
*NEVER USE FORCE PUSH OR DESTRUCTIVE GIT OPERATIONS*: `git push --force`, `git push --force-with-lease`, `git reset --hard`, `git clean -fd`, or any other destructive git operations are ABSOLUTELY FORBIDDEN. Use `git revert` to undo changes instead.
In your own example you have all this huge emphasis on the negatives, and then the positive is a tiny un-emphasized afterthought.
(more loosely: I'm a big proponent of this too, but it's a helluva hot take; how one positively frames "don't blow away the effing repo" isn't intuitive at all)
This is only restricted for *fully free* accounts, but this feature only requires a minimum of a paid Pro account. That starts around $4 USD/month, which sounds worth it to prevent lost work from a runaway tool.
It's trivial to set up, and you could literally ask claude to do it for you and never have any of these issues again.
Any and all "I don't want it to ever run this command" issues are just skill issues.
Just set up a hook that prevents any git commands you don't ever want it to run and you will never have this happen again.
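For illustration, a minimal sketch of such a guard, assuming Claude Code's PreToolUse hook mechanism (the tool call arrives as JSON on stdin; per the current docs exit code 2 blocks the call, but verify against your version). The file name and the exact pattern list below are my own assumptions:

```python
#!/usr/bin/env python3
# Hypothetical PreToolUse hook: deterministically block destructive git commands.
# Assumes input like {"tool_name": "Bash", "tool_input": {"command": "..."}} on stdin
# and that exit code 2 rejects the tool call (stderr is fed back to the model).
import json
import re
import sys

FORBIDDEN = [
    r"git\s+reset\s+--hard",
    r"git\s+push\s+--force(-with-lease)?",
    r"git\s+clean\s+-\S*f",
]

payload = json.load(sys.stdin)
command = payload.get("tool_input", {}).get("command", "")

for pattern in FORBIDDEN:
    if re.search(pattern, command):
        print(f"Blocked destructive git command: {command}", file=sys.stderr)
        sys.exit(2)   # block the call; the model sees the reason on stderr

sys.exit(0)           # everything else is allowed
```

Wired up as a PreToolUse entry matching the Bash tool in `.claude/settings.json` (check the hooks docs for the exact shape), the check runs outside the model, so it does not depend on CLAUDE.md being followed.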
Whenever I see stuff like this I just wonder if any of these people were ever engineers before AI, because the entire point of software engineering for decades was to make processes as deterministic and repeatable as possible.
I just checked; mine also doesn't.
do not share a workspace with the llm, or with anybody for that matter.
How would the llm even distinguish what was written by them and what was written by you?
I don’t think this is a valid way of checking for spawned processes. Git commands are fast; 0.1-second intervals are not enough. I would replace the git on the $PATH with a wrapper that logs all operations and then execs the real git.
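A minimal sketch of that wrapper idea; the real git location and log path are assumptions to adapt:

```python
#!/usr/bin/env python3
# Hypothetical "git" shim: place it on $PATH ahead of the real binary.
# It appends every invocation to a log, then execs the real git so
# behaviour is otherwise unchanged.
import datetime
import os
import sys

REAL_GIT = "/usr/bin/git"                        # adjust to where the real git lives
LOG_FILE = os.path.expanduser("~/git-calls.log")

with open(LOG_FILE, "a") as log:
    log.write(f"{datetime.datetime.now().isoformat()} {' '.join(sys.argv[1:])}\n")

os.execv(REAL_GIT, [REAL_GIT] + sys.argv[1:])    # hand off to the real git
```

Unlike polling for processes at 0.1-second intervals, this catches every call no matter how quickly it finishes.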
Maybe even submitting the bug report "agentically" without user input, if it's running on host without guardrails (pure speculation).
(No need to use bpftrace, just an easy example :-) )
This whole LLM thing is a blast, huh?
Now I wish I could reject `git reset --hard` on my local system somehow.
The idea that a natural request can get Claude to invoke potentially destructive actions on a timer is silly.
https://code.claude.com/docs/en/scheduled-tasks#set-a-one-ti...
What would it cost if the /loop command was required instead of optional?