The core insight: LLMs don't reliably follow instructions, but you can catch failures cheaply and retry with targeted feedback. This is essentially a lightweight "process reward model" that requires zero training.
How it works:

1. Your LLM generates output.
2. ai-assert checks it against constraints (length, word count, sentence count, regex, custom predicates).
3. Each constraint returns a score in [0, 1]; the composite is multiplicative (a zero on any constraint makes the overall score zero).
4. If the score falls below a threshold, retry with targeted feedback ("Constraint X failed because Y -- regenerate").
5. Return the best-scoring attempt.
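The loop above can be sketched in a few lines of plain Python. This is an illustrative sketch of the idea, not ai-assert's actual API; the constraint helpers and function names here are hypothetical.

```python
import math
import re
from typing import Callable

# A constraint maps generated text to a score in [0, 1].
Constraint = Callable[[str], float]

def max_words(n: int) -> Constraint:
    """Hypothetical constraint: pass iff the text has at most n words."""
    return lambda text: 1.0 if len(text.split()) <= n else 0.0

def must_match(pattern: str) -> Constraint:
    """Hypothetical constraint: pass iff the regex matches somewhere."""
    return lambda text: 1.0 if re.search(pattern, text) else 0.0

def composite(text: str, constraints: list[Constraint]) -> float:
    # Multiplicative composite: a zero on any constraint zeroes the total.
    return math.prod(c(text) for c in constraints)

def generate_with_retries(llm, prompt, constraints, threshold=1.0, max_tries=3):
    """Generate, score, and retry with feedback; return the best attempt."""
    best_text, best_score = "", -1.0
    for _ in range(max_tries):
        text = llm(prompt)                     # step 1: generate
        score = composite(text, constraints)   # steps 2-3: check and score
        if score > best_score:
            best_text, best_score = text, score
        if score >= threshold:                 # good enough: stop early
            break
        # step 4: retry with feedback appended to the prompt
        prompt += "\nA constraint failed -- regenerate and satisfy all constraints."
    return best_text, best_score               # step 5: best-scoring attempt
```

Because `llm` is just a string-in, string-out callable, any client wrapper plugs in directly.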
On IFEval (25 instruction-following constraint types), accuracy improves from 69.3% to 76.2%.
278 lines. Zero dependencies. Works with any callable that takes a string and returns a string.
pip install ai-assert