frontpage.

In February, Summer Yue, Meta's director of AI alignment, posted about her OpenClaw agent deleting 200+ emails after she'd told it to wait for approval. She screamed "STOP OPENCLAW" at it. It kept going.

The root cause: her constraint lived in the conversation. When the context compacted, it disappeared. The AI hadn't gone rogue; it had genuinely forgotten.

Zora's safety architecture is designed so that it can't happen. A few things that are different:

Compaction-proof rules. Policy lives in ~/.zora/policy.toml, loaded before every action, not in context. The LLM and the PolicyEngine don't share a channel.

Prompt injection defense. Every incoming message (Signal/Telegram) passes through a CaMeL dual-LLM quarantine, an isolated model with no tool access that extracts structured intent from raw text. The main agent never sees the original message.

Runtime safety layer. Every tool call is scored 0–100 for irreversibility before it executes. High-risk actions pause and route to your phone for approval via Signal or Telegram. A session risk forecaster tracks drift, salami slicing, and commitment creep across the whole session.

Locked by default. A misconfigured Zora does nothing. A misconfigured OpenClaw has full system access.

npm i -g zora-agent && zora-agent init

https://github.com/ryaker/zora

fp.

Show HN: Zora, AI agent with compaction-proof memory and a runtime safety layer