I made Proof Loop fairly light, intentionally. It’s basically a protocol helper script for AI agent tasks:
- set acceptance criteria before coding/implementation
- keep the builder and verifier roles separate
- test each criterion and record the result as PASS, FAIL or UNKNOWN
- attach evidence that the work is done
- keep the proof evidence in the repo, so that the next agent or run can inspect it and see what was already done
You can try it from the command line: clone the repo, go to the proof-loop directory and run make demo.
The demo creates a task, checks the proof bundle, fails if evidence is missing, then passes once the acceptance criteria have evidence attached.
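To make the fail-then-pass behaviour concrete, here is a minimal Python sketch of that kind of check. It is not Proof Loop's actual code or file format: the names Status, Criterion and check_bundle, and the evidence path, are hypothetical and only illustrate the idea that a PASS without evidence does not count.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """The three result values mentioned in the post."""
    PASS = "PASS"
    FAIL = "FAIL"
    UNKNOWN = "UNKNOWN"


@dataclass
class Criterion:
    """One acceptance criterion (hypothetical structure, not Proof Loop's schema)."""
    text: str
    status: Status = Status.UNKNOWN
    evidence: list[str] = field(default_factory=list)  # e.g. paths to logs in the repo


def check_bundle(criteria: list[Criterion]) -> tuple[bool, list[str]]:
    """Return (ok, problems). The bundle is ok only if every criterion
    is PASS and has at least one piece of evidence attached."""
    problems = []
    if not criteria:
        problems.append("no acceptance criteria defined")
    for c in criteria:
        if c.status is not Status.PASS:
            problems.append(f"{c.text}: status is {c.status.value}")
        elif not c.evidence:
            problems.append(f"{c.text}: PASS claimed without evidence")
    return (not problems, problems)


# A criterion marked PASS but with nothing attached: the check fails.
criteria = [Criterion("unit tests pass", Status.PASS)]
ok_before, problems_before = check_bundle(criteria)

# Attach evidence (hypothetical path) and check again: now it passes.
criteria[0].evidence.append("logs/test-run.txt")
ok_after, problems_after = check_bundle(criteria)

print(ok_before, problems_before)
print(ok_after, problems_after)
```

In this sketch UNKNOWN is treated the same as FAIL for gating purposes, which is one possible design choice; the real tool may handle it differently.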
There is also an OpenClaw skill version now, so the easiest way to use it is:
openclaw skills install proof-loop
The GitHub repo also has a harness-agnostic version and examples.
I would especially like criticism or any other feedback from people who run Codex, Claude Code or OpenCode on long-running multi-step tasks.
Note: this is a utility I use myself. It is free of charge and open source under the MIT license, and I have no intention of commercializing it.