LLMs are great at creating plans but terrible at following them. I've seen agents claim to create 5 files but only make 2, repeat API calls 3x, skip error handling, then report success anyway. The fix: treat execution like todo management. Track every step, block the agent if it tries tools not in the current step, and verify completion (don't trust its word, actually check that the file exists). This, plus guardrails and git-like versioning, improved reliability significantly.
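To make the step-tracking idea concrete, here is a rough sketch of what such an executor could look like. It is not the actual implementation described above: `Step`, `PlanExecutor`, `ToolNotAllowed`, and the `write_file` tool are all hypothetical names made up for illustration, and guardrails and versioning are left out.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Callable


class ToolNotAllowed(Exception):
    """Raised when the agent calls a tool outside the current step."""


@dataclass
class Step:
    # Hypothetical structure: one todo item in the plan.
    description: str
    allowed_tools: set[str]
    verify: Callable[[], bool]  # independent check, not the agent's own report
    done: bool = False


@dataclass
class PlanExecutor:
    steps: list[Step]
    current: int = 0

    @property
    def finished(self) -> bool:
        return self.current >= len(self.steps)

    def call_tool(self, name: str, fn: Callable, *args, **kwargs):
        """Run a tool only if the current step allows it."""
        if self.finished:
            raise ToolNotAllowed(f"{name!r} called after the plan finished")
        step = self.steps[self.current]
        if name not in step.allowed_tools:
            raise ToolNotAllowed(f"{name!r} not allowed in step: {step.description}")
        return fn(*args, **kwargs)

    def complete_step(self) -> bool:
        """Advance only if the step's own check passes."""
        step = self.steps[self.current]
        if not step.verify():
            return False
        step.done = True
        self.current += 1
        return True


def write_file(path: str, text: str) -> None:
    # Hypothetical tool the agent is allowed to use.
    with open(path, "w") as f:
        f.write(text)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "report.txt")
executor = PlanExecutor(steps=[
    Step("write report.txt", {"write_file"}, verify=lambda: os.path.exists(target)),
])

# The agent "claims" it is done before doing anything: the check rejects it.
claimed_before_writing = executor.complete_step()

# The agent tries a tool that is not part of this step: it gets blocked.
try:
    executor.call_tool("http_get", lambda url: None, "https://example.com")
    blocked = False
except ToolNotAllowed:
    blocked = True

# The agent does the real work, and only then does the step complete.
executor.call_tool("write_file", write_file, target, "hello")
verified_after_writing = executor.complete_step()
```

The point of the design is that `complete_step` never takes the agent's word for anything: the step advances only when `verify` passes against the real filesystem.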
verdverm•3mo ago
Seems reasonable, and it resonates with the approach I plan to take when I start building my agent.
sunir•3mo ago
If the plan is too big to fit into context, or requires too much attention, it overwhelms the LLM. You need to decompose into tasks and todos aggressively.
mayankd•3mo ago
For sure. Agents tend to bang their heads against a wall, and can deviate in surprising ways to try to escape it. Scoping a plan well while making agents stick to it is a tricky balance to strike.
anup_sia•3mo ago
verdverm•3mo ago