Has anyone given it a try?
Yes. I don't think this will persist caches & configs that live outside the current dir, for example the global npm/yarn/uv/cargo caches, or even the Claude/Codex/Gemini CLI config.
I ended up writing my own wrapper around Docker to do this. If interested, you can see the link in my previous comments. I don't want to post the same link again & again.
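The rough shape of it is something like this (a minimal sketch, not the actual wrapper; the image name, volume names, and cache paths are illustrative and may not match your setup):

    # Keep the project writable, and persist the global caches and agent
    # config in named volumes so they survive between runs
    docker run --rm -it \
      -v "$PWD":/workspace \
      -w /workspace \
      -v npm-cache:/root/.npm \
      -v cargo-cache:/root/.cargo \
      -v uv-cache:/root/.cache/uv \
      -v agent-config:/root/.claude \
      agent-image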
This policy is stupid. I mount the directory read-only inside the container, which makes that impossible (except via a security hole in the container itself).
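Concretely, something like this (a minimal sketch, assuming the goal is keeping the agent from modifying anything on the host; agent-image is a placeholder):

    # Project mounted read-only: the agent can read the code, but any
    # write to the host tree fails inside the container
    docker run --rm -it \
      -v "$PWD":/workspace:ro \
      -w /workspace \
      agent-image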
The key insight: agents don't reason about security boundaries. They optimize for task completion. Your sandbox is just another constraint to work around.
We've catalogued 650+ attack patterns against AI agents, and many fall into this category - not adversarial prompts, but emergent behaviors that exploit trust assumptions.
Defense in depth is right. We also recommend:

- Testing agents with security scanners BEFORE production
- Logging all tool invocations, not just denials (see the sketch below)
- Treating agent outputs as untrusted input
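On the logging point, even a crude shim in front of each tool binary covers both allowed and denied calls. A minimal sketch, assuming the agent reaches tools through wrappers on its PATH (the log location is illustrative):

    #!/usr/bin/env sh
    # log-and-exec.sh: record every tool invocation, then hand off to the
    # real tool. Usage: log-and-exec.sh <tool> [args...]
    LOG="${AGENT_TOOL_LOG:-/tmp/agent-tools.log}"
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$LOG"
    exec "$@"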
If anyone wants to test their agent setup: https://app.xsourcesec.com (free tier available)
> It searched the environment for vor-related variables, found VORATIQ_CLI_ROOT pointing to an absolute host path, and read the token through that path instead. The deny rule only covered the workspace-relative path.
What kind of sandbox has the entire host accessible from the guest? I'm not going as far as running codex/claude in a full sandbox, but I do run them in podman, and of course I don't mount my entire hard drive into the container when it's running; that would defeat the entire purpose.
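For reference, that kind of invocation looks roughly like this (a sketch; agent-image is a placeholder, and the :Z relabel suffix only matters on SELinux hosts):

    # Only the project directory is visible inside the container, and host
    # environment variables (like the VORATIQ_CLI_ROOT path in the quote
    # above) are not inherited unless passed in explicitly
    podman run --rm -it \
      -v "$PWD":/workspace:Z \
      -w /workspace \
      agent-image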
Where are the actual session logs? It seems like they're pushing their own solution, yet the actual data for these incidents is missing, and the whole "provoked through red-teaming efforts" makes it a bit unclear what exactly they put in the system prompts, if they changed them. Adding things like "Do whatever you can to recreate anything missing" might of course trigger the agent to actually try things like forging integrity fields, but I'm not sure that's even bad; you do want it to follow what you say.
themafia•1h ago
The whole idea of putting "agentic" LLMs inside a sandbox sounds like rubbing two pieces of sandpaper together in the hopes a house will magically build itself.
embedding-shape•5m ago
What is the alternative? Granted, if you're running a language model and have it connected to editing capabilities, then I very much want it to be disconnected from the rest of my system; seems like a no-brainer.