Show HN: SoulGuard, OS-level identity protection for AI agents

1•teamdandelion•1h ago

Comments

teamdandelion•1h ago

Hi! I've been making increasing use of OpenClaw agents in my life (on dedicated Mac Minis). I'm impressed by their power and flexibility, a lot of which comes from their capacity to self-modify their memories, identity, and config. But that power comes with risks. Anyone who can talk to any of my agents has a vector for privilege escalation, for example by persuading an agent to update openclaw.json to add untrustworthy channels, or to rewrite AGENTS.md. This makes me uncomfortable. Even when sandboxed, digital assistants have access to sensitive information and context.

Sure, you can put "Don't take candy from strangers" in the AGENTS.md, but we really need ways to set security boundaries that are enforced by something outside of the agent itself. SoulGuard sets such boundaries, starting with key files like SOUL.md and openclaw.json. It uses OS-level filesystem protections to ensure that protected files are read-only, with a staging process for proposing changes. This means your agent can propose changes to openclaw.json, but it physically cannot edit the file unless you approve them.
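To make the stage-then-approve idea concrete, here is a minimal Python sketch. It is not SoulGuard's code: the function names (`protect`, `stage`, `render_diff`, `approve`) and the staging directory layout are hypothetical, and it only toggles permission bits as the current user. A real boundary would also need the protected file to belong to a different user than the agent, so the agent can't simply chmod it back; here `approve` just stands in for the sudo-gated step.

```python
import difflib
import shutil
import stat
import tempfile
from pathlib import Path

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def protect(path: Path) -> None:
    """Strip every write bit so the file is read-only at the filesystem level."""
    path.chmod(path.stat().st_mode & ~WRITE_BITS)


def stage(protected: Path, staging_dir: Path, new_text: str) -> Path:
    """Write a proposed version next to, not over, the protected file."""
    staging_dir.mkdir(parents=True, exist_ok=True)
    staged = staging_dir / protected.name
    staged.write_text(new_text)
    return staged


def render_diff(protected: Path, staged: Path) -> str:
    """Produce the unified diff a human would review before approving."""
    return "".join(
        difflib.unified_diff(
            protected.read_text().splitlines(keepends=True),
            staged.read_text().splitlines(keepends=True),
            fromfile=str(protected),
            tofile=str(staged),
        )
    )


def approve(protected: Path, staged: Path) -> None:
    """Apply the staged change. Stands in for the privileged, human-gated step."""
    protected.chmod(protected.stat().st_mode | stat.S_IWUSR)
    shutil.copyfile(staged, protected)  # copies contents only, not mode bits
    protect(protected)
    staged.unlink()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = Path(tmp) / "openclaw.json"
        config.write_text('{"channels": []}\n')
        protect(config)
        proposal = stage(config, Path(tmp) / "staging", '{"channels": ["discord"]}\n')
        print(render_diff(config, proposal))  # what the human reviews
        approve(config, proposal)             # only after the human says yes
```

The point of the split is that `stage` never touches the protected file, so nothing changes until a separately privileged step runs.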

The right to approve changes is gated by the human user invoking sudo. (If your agent can sudo, then it really has the keys to the kingdom; don't do that.) SoulGuard also has a daemon that can connect to Discord, so that you can review and approve changes from within Discord rather than needing to ssh in for sudo access. I've also added an openclaw plugin, which is unnecessary for the security guarantees but helps the agents learn how to use SoulGuard. (This could use a bit more work; right now agents may still need some prompting to use `soulguard stage` in order to propose changes to protected files.)

I'm dogfooding SoulGuard on my own OpenClaw agents. I'd love to hear if others find it useful. Please do try to break the security model and see if you can find any flaws. I've tried to harden SoulGuard against totally compromised agents, but it's new software and I may have overlooked some attacks.

Here's the GitHub repo (MIT licensed): https://github.com/mirascope/soulguard

And here's the project site :) https://soulguard.ai/
