The distinction between wish list and contract is real. CLAUDE.md tells the agent what you want but can't enforce it. The enforcement has to happen downstream - hooks for deterministic rules (formatting, test runs, linting), but risk scoring for the non-deterministic stuff (did it introduce a security vulnerability? did it touch auth middleware without tests?). The model will cheerfully ignore 80% of your CLAUDE.md when the context window fills up. The only reliable contract is one that evaluates the output, not one that constrains the input.
mergeshield•1h ago