- sycophancy, I'm honestly tired of "You're absolutely right". I need a pair programmer, something that's gonna correct me, provide different ideas, etc.
- inability to follow the script. Even though it will tell you you're right, it will still do its thing. Doesn't matter if I spend 2 hours writing a detailed spec file, a todo list, etc, Claude's gonna do its thing regardless. You can even correct it with "no, don't do this", and it will still do it regardless. I understand that this is how AI works (it's like children, if you tell them to not do something it's more likely they will), but it's just unbearable.
For both of these things it's impossible to make it go right. No matter the system instructions, the prompts, the context management, it's just terrible.
That's not to say it's all bad: there are things I like about Claude and AI assistants. I firmly believe that a coder with AI is much more productive than one without. But what should be delegated to AI is not writing and editing code, but planning it, writing specs, doing research, verifying you're maintaining docs, suggesting ideas, alternatives, and test cases, reviewing PRs according to guidelines, etc.
I don't even think it's a matter of "it will get better": it produces way more code than a human can review, and reviewing code is more difficult than writing it in the first place.
Even more, it can provide value in tasks humans are just bad at, such as writing good issues/tasks, stuff like user stories that use consistent ubiquitous language, etc. Stuff that's hard to get stakeholders to get right, but that can be achieved with a set of good rules and by having the stakeholders interact first with a chatbot that can guide them to write much more clearly.
But still not as infuriating as the second. And it can be really hard to stop it from doing something you don't want.
I find that one can use Claude to produce lower quality code faster, but one can also use it to produce higher quality code slower, by using it as a pair programmer, rubber duck, to try experiments, et cetera.
I have no clue how to keep it from going off the rails; it's one of the most common criticisms I see on Reddit too.
> I find that one can use Claude to produce lower quality code faster, but one can also use it to produce higher quality code slower, by using it as a pair programmer, rubber duck, to try experiments, et cetera.
That's a very good phrase I'm gonna steal.
Instructions go a long way. There probably needs to be a better LLM+prompt+loop at the top, the one you interface with, or maybe one below that.
My next step is taking this over instead of outsourcing to M$, Google, or Anthropic. It's just too important to let others decide how these tools should work at this point. It needs to be more open, something we can tinker with like vim.
Etheryte•4h ago
I'm not really convinced that this warrants the title the post currently has. For one, I hadn't even heard of Vibe Kanban prior to this, and for two, the error bars on this must be insanely wide as is.