Then I fixed the constraints. Eight days, zero failures, zero intervention.
The secret wasn't better prompts... it was treating the LLM as a constrained function: schema-validated tool calls that reject malformed output and force retries, two-pass architecture separating editorial judgment from formatting, and boring DevOps (retry logic, rate limiting, structured logging).
The Claude invocation is ~30 lines in a 2000-line system. Most of the work is everything around it.
https://seanfloyd.dev/blog/llm-reliability https://github.com/SeanLF/claude-rss-news-digest