```
I have modified the type signature and behaviour of how jobs are created. Previously, job definition create took a batch argument (created from a queue). Now it takes the queue directly, is async, requires the databaseClient to be passed in at creation (vs. when the batch is executed). It no longer returns anything - which is fine because the result was only being used for logging - which is now done for us so we don't have to worry. Can we refactor the codebase to make use of the new JobDefinition.create? Remove the vestigial "Job created" log please.
Perform this task and this task only. If you see something unrelated that you believe needs to be refactored - DO NOT MODIFY IT. ONLY PERFORM ACTIONS DIRECTLY RELEVANT TO THIS TASK
```
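To make the change concrete, here is a rough TypeScript sketch of what the old and new call sites might look like. Only `JobDefinition.create`, the queue, the batch and `databaseClient` come from the prompt above; the interfaces, the `OldJobDefinition` name, the `enqueueOld`/`enqueueNew` helpers and the `send-email` job are hypothetical stand-ins, not the real code.

```typescript
// Hypothetical types: the prompt only names JobDefinition.create, the queue,
// the batch and databaseClient. Everything else here is assumed.
interface DatabaseClient {
  insert(row: Record<string, unknown>): Promise<void>;
}
interface Queue {
  name: string;
}
interface Batch {
  queue: Queue;
  // Writes are deferred until the batch is executed with a client.
  pending: Array<(db: DatabaseClient) => Promise<void>>;
}
interface JobInput {
  type: string;
  payload: Record<string, unknown>;
}

// Before: synchronous, takes a batch, returns the job for the caller to log.
const OldJobDefinition = {
  create(args: { batch: Batch } & JobInput): { id: string } {
    const job = { id: `${args.batch.queue.name}:${args.type}` };
    args.batch.pending.push((db) =>
      db.insert({ id: job.id, type: args.type, payload: args.payload })
    );
    return job;
  },
};

// After: async, takes the queue and databaseClient up front, logs internally,
// and resolves to nothing.
const JobDefinition = {
  async create(
    args: { queue: Queue; databaseClient: DatabaseClient } & JobInput
  ): Promise<void> {
    const id = `${args.queue.name}:${args.type}`;
    await args.databaseClient.insert({ id, type: args.type, payload: args.payload });
    console.log("Job created", id);
  },
};

// Call site before the refactor.
async function enqueueOld(queue: Queue, databaseClient: DatabaseClient): Promise<void> {
  const batch: Batch = { queue, pending: [] };
  const job = OldJobDefinition.create({ batch, type: "send-email", payload: {} });
  console.log("Job created", job.id); // the vestigial log to remove
  for (const run of batch.pending) await run(databaseClient);
}

// Call site after the refactor: only the create call itself changes.
async function enqueueNew(queue: Queue, databaseClient: DatabaseClient): Promise<void> {
  await JobDefinition.create({ queue, databaseClient, type: "send-email", payload: {} });
}
```

In this sketch the whole refactor per call site is: drop the batch, pass the queue and `databaseClient` in, add `await`, and delete the log line.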
So there are two instructions:
1. Do the task
2. Don't do stuff that isn't the task (added in frustration on subsequent attempts)
My experience:
The agent flow started well - it found all the files that needed to change and began making edits.
By about file #5 I noticed that, on top of the requested refactor, it had started re-ordering the object keys passed to `JobDefinition.create`. Although semantically a no-op, this was incredibly frustrating as it made the diffs much harder to review.
A little later, it started to modify log messages it wasn't happy with, before eventually going completely off the rails and adding arguments to my function definitions that it _thought_ they needed (introducing type and run-time errors).
VSCode would periodically pause and ask for a confirmation in order to continue. Each time I used the opportunity to re-prompt the agent to stay on target:
Me: "STOP GOING OFF TASK - STOP RENAMING VARIABLES, REORDERING PARAMS. JUST DO AS THE TASK TELLS YOU AND NOTHING ELSE"
Agent: "You're absolutely right. I apologize for going off task. Let me focus solely on the task: refactoring JobDefinition.create calls to use the new signature and removing vestigial "Job created" logs"
And each time the bad behavior would return after some time.
I'm not sure what I'm doing wrong. I assumed this sort of mechanical monkey work would be bread and butter for an agentic workflow - but it just keeps losing coherence.
I ended up reverting all the changes as I had absolutely 0 trust in the quality of the generated code.
I apologise for the wall of text but I'm quite frustrated about all the time wasted and am desperate to know what I'm doing wrong!
Thanks in advance!
Mave83•1h ago
Add a CLAUDE.md with clear development guidelines in it, have the agent verify its git changes against those guidelines, and of course run things like a linter.
It will go off the rails, especially after compaction, but you can make it correct its mistakes on its own.
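As a rough illustration of checking the git changes mechanically, here is a minimal TypeScript sketch that scans `git diff` output and flags hunks in which no changed line mentions the refactor's targets. The `findOffTaskHunks` helper and the `ON_TASK` patterns are hypothetical, not part of any existing tool, and the heuristic is coarse: a flagged hunk is only a candidate for review, not proof of an off-task edit.

```typescript
// Hypothetical helper: flags diff hunks in which no changed line mentions
// the refactor's targets. The patterns are assumptions for this one task.
const ON_TASK = [/JobDefinition\.create/, /Job created/];

interface SuspectHunk {
  file: string;
  header: string;
}

function findOffTaskHunks(unifiedDiff: string): SuspectHunk[] {
  const suspects: SuspectHunk[] = [];
  let file = "";
  let header = "";
  let sawChange = false;
  let onTask = false;

  // Record the hunk that just ended if it changed lines but never hit a target.
  const flush = (): void => {
    if (header && sawChange && !onTask) suspects.push({ file, header });
    header = "";
    sawChange = false;
    onTask = false;
  };

  for (const line of unifiedDiff.split("\n")) {
    if (line.startsWith("diff --git ")) {
      flush(); // a new file section begins
    } else if (!header && line.startsWith("+++ ")) {
      file = line.slice(4).replace(/^b\//, ""); // path of the new version
    } else if (line.startsWith("@@")) {
      flush();
      header = line; // a new hunk begins
    } else if (header && (line.startsWith("+") || line.startsWith("-"))) {
      sawChange = true;
      if (ON_TASK.some((re) => re.test(line))) onTask = true;
    }
  }
  flush();
  return suspects;
}
```

Feeding it the output of `git diff` after each agent step would give a short list of hunks to inspect or revert before letting the agent continue.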