The demo runs two agents. Frank, a conservative agent: verification returns VALID. Phil, an aggressive agent whose evidence has been tampered with: verification returns INVALID and points to the exact line where the chain breaks.
The problem I was solving: when an AI agent does something unexpected in production, the post-mortem usually comes down to "trust our logs." I wanted evidence that could cross trust boundaries — from engineering to security, compliance, or regulators — without asking anyone to trust a dashboard.
How it works:
- Every action, policy decision, and state transition is recorded into a hash-chained NDJSON event log
- Logs are sealed into evidence packs (ZIP) with manifests and signatures
- A verifier (also in the demo) validates integrity offline and returns VALID / INVALID / PARTIAL with machine-readable reason codes
- The same inputs always produce the same artifacts, so diffs are meaningful and replay is deterministic
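To make the hash-chaining concrete, here is a minimal sketch of the idea, not the project's actual schema: each NDJSON line embeds the SHA-256 of the previous serialized line, so any edit breaks every subsequent link, and an offline verifier can report the exact line where the chain fails. The field names (`prev`, `event`) and the 1-based line reporting are illustrative assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # chain anchor for the first event (illustrative convention)

def append_event(log_lines, event):
    """Append an event, chaining it to the hash of the previous serialized line."""
    prev = hashlib.sha256(log_lines[-1].encode()).hexdigest() if log_lines else GENESIS
    # sort_keys + compact separators keep serialization deterministic,
    # so identical inputs always yield byte-identical log lines
    line = json.dumps({"prev": prev, "event": event},
                      sort_keys=True, separators=(",", ":"))
    log_lines.append(line)
    return log_lines

def verify(log_lines):
    """Offline check: returns ('VALID', None) or ('INVALID', failing line number)."""
    expected = GENESIS
    for i, line in enumerate(log_lines):
        if json.loads(line).get("prev") != expected:
            return "INVALID", i + 1  # 1-based line where the chain breaks
        expected = hashlib.sha256(line.encode()).hexdigest()
    return "VALID", None

# Usage: a clean log verifies; tampering with line 2 is caught at line 3,
# because line 3's `prev` no longer matches the hash of the altered line 2.
log = []
for action in ["plan", "call_tool", "commit"]:
    append_event(log, {"action": action})
assert verify(log) == ("VALID", None)

tampered = list(log)
tampered[1] = tampered[1].replace("call_tool", "rm_rf")
assert verify(tampered) == ("INVALID", 3)
```

Deterministic serialization is what makes the later properties (meaningful diffs, deterministic replay) fall out for free: if the bytes are stable, the hashes are stable.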
The verifier and the UI are deliberately separated. The UI can be wrong; the verifier will still accept or reject based on cryptographic proof alone.
Built this before the recent public incidents around autonomous agents made it topical. Happy to answer questions about the architecture, the proof boundary design, or the gaps I'm still working on.