We’ve been integrating AI coding agents deeply into our development workflows and have seen major gains once we let them run in long loops (test → build → search → refine). The problem: to be effective, agents need real access to the environment.
Our first approach was allowing agents to run bash. It works, but it’s obviously dangerous — arbitrary commands, no guardrails, and a lot of implicit trust.
OpenCuff is our attempt to fix this.
It’s an open-source tool that gives AI coding agents explicit, capability-based access instead of raw shell execution. Agents can run useful actions (tests, builds, searches, etc.) while staying within well-defined, auditable boundaries. The goal is to let agents run autonomously for hours without turning your machine or CI environment into a free-for-all.
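To make “capability-based access” concrete, here’s a rough sketch of the shape (illustrative Python, not OpenCuff’s actual API or schema): the agent asks for a named action, and a broker maps it onto a pre-approved command.

    import subprocess

    # Each capability is an explicit, auditable allowlist entry.
    CAPABILITIES = {
        "run_tests": ["pytest", "-q"],
        "build": ["make", "build"],
        "search": ["rg", "--line-number"],
    }

    def invoke(capability, *args):
        """Run a pre-approved action; reject anything else."""
        if capability not in CAPABILITIES:
            raise PermissionError(f"capability not granted: {capability}")
        cmd = CAPABILITIES[capability] + list(args)
        print(f"[audit] {capability}: {cmd}")  # every invocation is logged
        # A real implementation would also validate args, not just the verb.
        return subprocess.run(cmd, capture_output=True, text=True)

    # invoke("run_tests") is fine; there is simply no capability that
    # maps to `bash -c <anything>`, so arbitrary commands can't happen.

The sketch skips a lot (argument validation, sandboxing, resource limits), but it captures the inversion we’re aiming for: the agent names an intent, and a human pre-approves what that intent can execute.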
This is an early, fast-moving project. We built it for ourselves, but we think others running AI agents at scale may run into the same tradeoffs.
We’d really appreciate feedback from the HN community — especially around:
- threat models we may be missing
- comparisons to sandboxing / containers / policy-based systems
- what you’d want from a tool like this
Links: https://opencuff.ai
https://github.com/OpenCuff/OpenCuff
Happy to answer questions and discuss design decisions.