Show HN: I Built a Sandbox for Agents

https://github.com/vrn21/bouvet.com

31•vrn21•1w ago

Comments

bosky101•1w ago

The right link is https://github.com/vrn21/bouvet

nadis•1w ago

Thank you.

ripped_britches•1w ago

Why is it a problem to use containers?

vrn21•1w ago

every syscall in a container runs on the host kernel with full privileges, so if needed one can break out of the container and get access to the host

ripped_britches•1w ago

> with full privs

No that’s just a misconfigured container then.

Unless there is an exploit on an unpatched kernel bug, a properly configured container shouldn’t allow break out

_pdp_•1w ago

We use a service but it is always nice to have a free option if you need it. Good stuff.

vrn21•1w ago

using sandboxes makes a lot of sense nowadays, but this is nowhere near the prod sandboxes on the market, which have a lot of work and optimization going into them! but yeah it's a fun little side project! thanks for the compliments :)

canadiantim•1w ago

This relies on the agent requesting a sandbox... which seems like the fox guarding the hen house, no?

vrn21•1w ago

tbh, this is kind of a gray area, i should have thought a little bit more about how it should have been architected,

for me this was the ideal scenario: a cloud model on web with its own sandbox (something like claude with read/write to files and run commands)

i don't really think this could be considered a fox guarding the hen house, it's not that ai wants to infect your computer with its commands, if ai is provided the mcp server, it will use it instead of tools like bash in most cases [if in a local setup]. I feel it's more of a lumberjack guarding his weapons.

monomial•1w ago

Is this a common pattern to have an agent request a sandbox? I feel like I'd want the whole agent running in its own sandbox to begin with. Firecracker does look like a decent solution for that.

mccraveiro•1w ago

I agree. I'm testing https://sprites.dev/ because of that.

vrn21•1w ago

When I started to design the system, I thought of creating a way for an agent on the cloud to have access to a filesystem, such that it can read and write files and run commands. I can't really say that this is the main source of income for the startups in the space, most of them rely on SDKs for other platforms. I could adjust the core to work as an SDK as well, but right now the main interface is just an MCP server that a client can use

cr125rider•1w ago

Is firecracker instead of a docker container worth the hassle?

kernc•4d ago

Probably not. Maybe Bubblewrap and sandbox-run. It's an anything-is-already-way-better-than-nothing type of thing.

[0]: https://github.com/containers/bubblewrap

[1]: https://github.com/sandbox-utils/sandbox-run

sahiljagtapyc•1w ago

interesting

debarshri•1w ago

Can someone elaborate on what's wrong with using containers as a sandbox?

binsquare•1w ago

It's because containers share the kernel with the host. Generally it's just not considered a security boundary. (Note that containers have come a long way on the security side btw)

So it's mostly a security thing.

debarshri•1w ago

But in the context of agents. Does it matter?

tptacek•1w ago

Depends. Probably not usually. I've thought about this a bunch and I think the serious "threat" here isn't the agent acting maliciously --- though agents will break out of non-hardened sandboxes! --- but rather them exposing some vulnerability that an actual human attacker exploits.

buu700•1w ago

I'd also add that I just don't like the idea in principle that I should have to trust the agent not to act maliciously. If an agent can run rm -rf / in an extreme edge case, theoretically it could also execute a container escape.

Maybe vanishingly unlikely in practice, but it costs me almost nothing to use a VM just in case. It's not impossible that certain models turn out to be poorly behaved, that attackers successfully execute indirect prompt injection via malicious tutorials targeting coding agents, or that some shadowy figure runs a plausibly deniable attack against me through an LLM API.

debarshri•1w ago

This is a genuine concern. But this sounds a bit independent of the execution environment. It could either be containers or VMs.

tptacek•1w ago

On a local machine, yeah, I think it's pretty situational. VMs are safer, but in risk management terms the win is sometimes not that significant.

In a multitenant cloud environment, of course, totally different story.

DeborahEmeni_•5d ago

I’ve been experimenting with this recently. Running services inside microVMs instead of plain containers makes the threat model easier to reason about, especially for multi-tenant or untrusted workloads. I’ve been trying it out on Northflank and the trade-offs become pretty obvious.

aghilmort•1w ago

security matters if you want to demarcate where agents can play. running the agent inside a strong VM is usually where it starts; a container is not enough for that. full isolation means it only sees the files you want it to, etc.

binsquare•1w ago

Imo it's even more important in context of agents, if these agents are as good as it's going to get with as much access as we let them.

starlust2•1w ago

One could theoretically use a prompt injection attack to exploit a privilege escalation vulnerability on the kernel.

ATechGuy•1w ago

What about VMs? They offer strong isolation, as they don't share kernels, and have long been a foundational piece for multi-tenant computing. Then, why would we put an extra layer on top and rebrand it as an AI agent sandboxing solution? I'm genuinely curious what pushes everyone to build their own and launch here. Is it one of those tarpit ideas: driven by own need and easy to build?

Ronsenshi•1w ago

From what I read others say at some point on HN:

- resources

- security

- setup speed?

I suppose a lot depends on how and in what environment you're dealing with agents.

Resources might be an issue on Mac if you have a bunch of agents running different things, trying to execute code in different containers. But that's an issue with Mac and the way containers run in a VM there.

Security-wise there were concerns with prompt injection telling agent to execute certain steps to escape from container. Possible, but I'm not aware if there were actually cases of that.

vrn21•1w ago

Luis wrote an excellent blog about it btw: https://www.luiscardoso.dev/blog/sandboxes-for-ai

tomasphan•1w ago

Seems these things pop up here every so often. Either using firecracker or docker/containers. How is this different from the other sandboxes? BTW I love that you got LLM testimonials lol

binsquare•1w ago

I'm building an alternative to firecracker here if you're looking for something wayy different: https://github.com/smol-machines/smolvm

aghilmort•1w ago

we've considered docker, firecracker, will add smol to working roster

context <> building something with QEMU

* required has to support LMW+AI (linux/mac/windows + android/ios)

there are scenarios in which we might spin micro vms inside that main vm, which by default is almost always Debian Linux distro with high probability.

one scenario is say ETL vm and AI vm isolated for various things

curious why building another microVM other than sheer joy of building, what smol does better or different, why use smol, etc. (microVMs to avoid etc also fair game :)

jkelleyrtp•1w ago

I needed Mac / win/ Linux / iOS / android for dioxus dev, so I built my own in rust.

https://skyvm.dev/

binsquare•1w ago

I focus on different design decisions.

Smolvm is designed to run locally, persistent (stateful), long running (efficiency), and interactive.

Worked with firecracker and other options a lot btw, most of everything is designed for ephemeral serverless workloads.

aghilmort•1w ago

oh interesting our qemu use case is local!

binsquare•1w ago

Oh neat!

Feel free to chat if you need anything, more user friendly docs are at smolmachines.com.

binsquare•1w ago

Cool option, I'm building in the same space. We should chat!

vrn21•1w ago

hi, you could find my socials at https://vrn21.com, feel free to dm :)

avaer•1w ago

Given that this is using Firecracker, is it Linux only?

vrn21•1w ago

yes, it is supposed to be hosted on bare metal linux machines; anyone on any machine can use it as a sandbox after adding the mcp server to the client

coip•1w ago

Anyone have any thoughts on this path if using macOS? Been using it, seems to do the trick pretty well out of the box.

https://developer.apple.com/documentation/Virtualization/run...

vrn21•1w ago

i think https://github.com/trycua/cua has some sort of it working

aghilmort•1w ago

interesting is the idea the agent calls it or just alt to terminal bash etc tool calls hey your tool calls are all microvms, containers, isoshells, raw term, clawd/molt all credentials with weaker and weaker security demarcs?

vrn21•1w ago

my ideal scenario is a cloud web model getting access to a sandbox to run commands and read/write to files. but yeah it could be used as an alternative to bash and read write tools.

I did not get your second question exactly, but yeah microvms can be considered one of the secure ways to run your agent

aghilmort•1w ago

Basically, just thinking that it’s more ideal to have the tool call the microVM rather than the agent doing it, in the sense that it’s mandated by the tool call

FEELmyAGI•1w ago

Great idea that is already implemented as a feature by major AI providers, several well funded startups, countless unfunded startups, and trivially solved per-user with any handful of existing technologies.

Truly baffling it's in the top 5 of the front page. My first thought was bot army upvoting but the total points are quite low. That means this is some mod's personal idea of an especially interesting submission?

arscan•1w ago

Having testimonials attributed to Gemini 3 Pro and Claude 4.5 Opus is... interesting. I'm curious what prompt was used to get those quotes.

vrn21•1w ago

lol thanks for the compliments, generated both the testimonials after giving the mcp server to both opus and gemini and asking for their feedback on it.

it is supposed to be directly used by agents, so they are kind of my end users, hence it made sense to get their testimonials :)

ATechGuy•1w ago

Congrats on launching, and great testimonials!

What problem does it solve compared to bazillion code execution sandboxing agents (and containers/VMs)?

Overall, a lot of people are building their own code execution sandboxing agents around containers/VMs. Curious to know what's missing that makes people DIY this?

Here's my list of code execution sandboxing agents launched in the last year alone:

1. E2B 2. AIO Sandbox 3. Sandboxer 4. AgentSphere 5. Yolobox 6. Exe.dev 7. yolo-cage 8. SkillFS ERA Jazzberry Computer Vibekit Daytona Modal Cognitora YepCode Run Compute CLI Fence Landrun Sprites pctx-sandbox pctx Sandbox Agent SDK Lima-devbox OpenServ Browser Agent Playground Flintlock Agent Quickstart Bouvet Sandbox Arrakis Cellmate (ceLLMate) AgentFence Tasker

vrn21•1w ago

Thanks for the compliments! I can't really say that it has a unique differentiator from all the other sandboxes out in the market, this was supposed to be a poc version of how i could build a sandbox for agents, this had been haunting me for a few months, so i tried out what could happen!

the actual sandboxes in the market are doing a lot of work in optimizing the system end to end, and most of it is pretty hard. this project just scratches the surface

vrn21•1w ago

yes this is a pretty busy market, but it is just a pure infra/OS problem and much of it has already been solved in the past decades, right now it's just who can fit everything together fastest

nadis•1w ago

Getting a 404 page not found for this project - how can I try it?

vrn21•1w ago

Sorry about the link, here's the og link: https://github.com/vrn21/bouvet/

nadis•1w ago

Thanks!

vrn21•1w ago

Sorry for the issue with the link, the accurate link is: https://github.com/vrn21/bouvet

cadamsdotcom•1w ago

You built a voluntary sandbox and it also uses lots of tokens in the context to load in the MCP definition?

Just looking to understand if the sandbox can be bypassed?
