Key design decisions:
- Hypothesis branching/pruning instead of broad exploration. Forms 3-5 hypotheses, tests each with causal queries, prunes dead ends, branches deeper on strong evidence (max depth: 4).
- Every mutation requires human approval. Full audit trail of what the agent thought, what it queried, and why.
- Pulls context from your runbooks, postmortems, and architecture docs (Confluence, Notion, Google Drive, or local markdown). The agent doesn't investigate in a vacuum.
- Deep Claude Code integration — auto-injects relevant operational context into coding sessions via MCP.
Try the demo without any API keys:
npx @runbook-agent/runbook demoGitHub: https://github.com/Runbook-Agent/RunbookAI
Would love to hear any questions/feedback!