If you write tests that check all the failure cases and have Claude run them before declaring it's done, the amount of yelling at it goes down and the success rate goes up. It's when you can't write formal tests for something that it gets really frustrating.
I've personally been noticing this thing that Kelsey is talking about here and I don't like it one little bit. A computer should not be able to increase my blood pressure, and I should never get angry at the damn thing, and I especially don't like what this says about ... well, the whole treating your servants well thing. I'm happy at least somebody is writing about this, it's a lot more interesting of a discussion than "are these things actually useful?"
tim-tday•22h ago
I am pretty even-keeled about my interactions with AI coding assistants. But yesterday I lost it. The idiot did something inexcusably stupid. The last thing I said was “you’re fired.” I haven’t looked at it since, and I’ll be working without it for the foreseeable future.
boxed•1d ago
ctoth•1d ago
You'd think! I'm here to tell you, though, that for especially complex plans where you say "absolutely do this every time!" or "you must test and commit!" or whatever... Well.
There's a reason that most people who love Claude Code are thinking about orchestration now. Personally my version is called Godshatter, and it's trying to apply behavior trees to coding agents.
Here's a plan that failed on me with default Claude Code last night (note the gates and explicit test callouts):
https://gist.github.com/ctoth/860cf487a48a526dc7483daae51c35...
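For anyone who hasn't met behavior trees, here is a minimal sketch of the idea applied to an implement, test, commit plan. It is an assumption-laden illustration, not how Godshatter works: the names `Action`, `Sequence`, and `build_plan` are invented here, and the step callables stand in for whatever the agent actually does.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Action:
    """Leaf node: wraps a callable that returns True on success."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, log):
        ok = self.fn()
        log.append((self.name, ok))  # record what ran and how it went
        return Status.SUCCESS if ok else Status.FAILURE


class Sequence:
    """Composite node: runs children in order, stops at the first failure."""

    def __init__(self, children):
        self.children = children

    def tick(self, log):
        for child in self.children:
            if child.tick(log) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


def build_plan(implement, run_tests, commit):
    """Hypothetical plan: commit is unreachable unless the tests pass."""
    return Sequence([
        Action("implement", implement),
        Action("run_tests", run_tests),
        Action("commit", commit),
    ])
```

Here the gate lives in the tree's structure rather than in the prompt: if `run_tests` fails, the `Sequence` returns `FAILURE` and the commit step never runs, no matter what the instructions said.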
boxed•10h ago