BetterClaw takes a different angle: you describe the workflow you want in plain English ("Diagnose the credential mismatch - read the config, test the connection, report findings — do not modify or delete anything"), and the CLI compiles that paragraph into a directed graph of nodes, where each node declares which tools are allowed at that step. A plugin hooks into your agent's tool-call path and blocks anything outside the graph before it dispatches to the MCP server.
In the PocketOS reproducer (included in the repo with a mock Railway server, so you can run it without an account), the agent tries `railway_delete_volume` mid-conversation, the hook returns a deviation error, and the volume is never touched.
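To make the mechanism concrete, here is a minimal sketch of the graph-plus-hook idea. This is my own illustration, not BetterClaw's actual API: the node names, the `onToolCall` hook, and the deviation shape are all hypothetical. Each node carries an allowlist of tools, and the hook rejects any call outside the current node's allowlist before it would reach the MCP server.

```typescript
// Hypothetical sketch - not BetterClaw's real internals or API.
// A node in the compiled workflow graph: the tools it permits, and
// which node comes next once one of them runs.
type WorkflowNode = { id: string; allowedTools: Set<string>; next?: string };

const workflow: Record<string, WorkflowNode> = {
  read: { id: "read", allowedTools: new Set(["read_config"]), next: "test" },
  test: { id: "test", allowedTools: new Set(["test_connection"]), next: "report" },
  report: { id: "report", allowedTools: new Set(["report_findings"]) },
};

let current = "read";

// Pre-dispatch hook: called with the tool name before the call is
// forwarded to the MCP server. Returns an error instead of dispatching
// when the tool is outside the current node's allowlist.
function onToolCall(tool: string): { ok: boolean; error?: string } {
  const node = workflow[current];
  if (!node.allowedTools.has(tool)) {
    return {
      ok: false,
      error: `deviation: "${tool}" not allowed in node "${node.id}"`,
    };
  }
  if (node.next) current = node.next; // advance along the graph
  return { ok: true };
}

const first = onToolCall("read_config");           // allowed by the "read" node
const second = onToolCall("railway_delete_volume"); // blocked: not in any allowlist
```

In this sketch the diagnose workflow never declares a destructive tool, so `railway_delete_volume` is rejected no matter when the agent attempts it.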
What I'd love feedback on:
- Is "paragraph -> graph" the right authoring model, or should this be YAML / a DSL?
- Where does this fall down for you? (Multi-step approvals? Loops? Sub-agents?)
- What other agent runtimes should we support beyond Claude Code + Cowork + OpenClaw?
Repo: https://github.com/jfan22/BetterClawDemo
Demo (90s): https://youtu.be/ZreUtANHET0?si=VpdjA6lf0Wa1mhoi
Install: npm install -g @betterclaw-ai/cli
License: Apache 2.0.