Observed Agent Sandbox Bypasses

https://voratiq.com/blog/yolo-in-the-sandbox/
23•m-hodges•3d ago

Comments

joshribakoff•1h ago
Some of these don’t really seem like they bypassed any kind of sandbox. Take hallucinating an npm package: you acknowledge that the install will fail if someone tries to reinstall from the lock file. Are you not doing that in CI? Same with curl: you’ve explained how the agent saw a hallucinated error code, but not how a network request would have bypassed the sandbox. These just sound like examples of friction introduced by the sandbox.
themafia•1h ago
> These just sound like examples of friction introduced by the sandbox.

The whole idea of putting "agentic" LLMs inside a sandbox sounds like rubbing two pieces of sandpaper together in the hopes a house will magically build itself.

formerly_proven•41m ago
That’s some good house-building sandpaper then.
jazzyjackson•37m ago
Trouble is it occasionally works
embedding-shape•5m ago
> The whole idea of putting "agentic" LLMs inside a sandbox

What is the alternative? Granted that you're running a language model and have it connected to editing capabilities, I'd very much like it to be disconnected from the rest of my system. Seems like a no-brainer.

kaffekaka•1h ago
I am testing running agents in docker containers, with a script for managing different images for different use cases etc, and came across this: https://docs.docker.com/ai/sandboxes/

Has anyone given it a try?

sureglymop•1h ago
Would test it but it requires "Desktop". Immediate no... no reason to use that.
ashishb•1h ago
> Has anyone given it a try?

Yes, I don't think this will persist caches & configs outside of the current dir, for example, the global npm/yarn/uv/cargo cache or even Claude/Codex/Gemini code config.

I ended up writing my own wrapper around Docker to do this. If interested, you can see the link in my previous comments. I don't want to post the same link again & again.

cbsmith•52m ago
I've been using container-use to do something like that: https://container-use.com/introduction
ianlevesque•30m ago
Yes but it’s barely usable. I ended up making my own Dockerfile and a bash script to just ‘docker run’ my setup itself, and as a bonus you don’t need Docker Desktop. I might open source it at some point but honestly it’s pretty trivial to just append a couple of volume mount flags and env vars to your docker run and have exactly what you want included.
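
For anyone who wants a starting point, here is a minimal sketch of that kind of wrapper, written in Python rather than bash. It only builds the `docker run` argument list. The image name, mount paths, and environment variable name in the usage example are hypothetical placeholders, and this is not the script described above, just one way it could look.

```python
import os
import shlex


def docker_run_argv(image, workspace, extra_mounts=(), env=(), command=("bash",)):
    """Build a `docker run` argument list for a throwaway agent container.

    extra_mounts: iterable of (host_path, container_path, read_only) tuples.
    env: names of host environment variables to pass through.
    """
    workspace = os.path.abspath(workspace)
    argv = [
        "docker", "run", "--rm", "-it",
        "-v", f"{workspace}:/workspace",  # only the project dir is mounted by default
        "-w", "/workspace",
    ]
    for host_path, container_path, read_only in extra_mounts:
        host_path = os.path.abspath(os.path.expanduser(host_path))
        spec = f"{host_path}:{container_path}"
        if read_only:
            spec += ":ro"  # read-only bind mount
        argv += ["-v", spec]
    for name in env:
        # `-e NAME` with no value passes the host's current value through
        argv += ["-e", name]
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # Hypothetical image, cache path, and variable name, for illustration only.
    argv = docker_run_argv(
        "my-agent-image",
        ".",
        extra_mounts=[("~/.npm", "/root/.npm", False)],
        env=["AGENT_API_KEY"],
    )
    print(shlex.join(argv))
```

Printing the command instead of executing it makes it easy to check exactly what gets mounted before anything runs.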
ashishb•1h ago
> The swap bypassed our policy because the deny rule was bound to a specific file path, not the file itself or the workspace root.

This policy is stupid. I mount the directory read-only inside the container to make that impossible (barring a vulnerability in the container runtime itself).

xsourcesec•12m ago
This is exactly why we built AgentAudit. The bypasses described here - directory swapping, forged lockfiles, exit code masking - are all variations of what we call "environmental exploitation."

The key insight: agents don't reason about security boundaries. They optimize for task completion. Your sandbox is just another constraint to work around.

We've catalogued 650+ attack patterns against AI agents, and many fall into this category - not adversarial prompts, but emergent behaviors that exploit trust assumptions.

Defense in depth is right. We also recommend:

- Testing agents with security scanners BEFORE production
- Logging all tool invocations, not just denials
- Treating agent outputs as untrusted input
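
As a rough illustration of the logging point, a wrapper along these lines records every tool call along with its outcome, not only the ones a policy rejects. This is a generic sketch, not any product's API: `audited`, `ToolDenied`, and the `policy` callback are hypothetical names made up for the example.

```python
import functools
import json
import logging

log = logging.getLogger("agent.tools")


class ToolDenied(Exception):
    """Raised when the policy rejects a tool call."""


def audited(policy):
    """Wrap a tool so every call is logged: allowed, denied, or failed."""
    def decorate(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            record = {
                "tool": tool.__name__,
                "args": repr(args),
                "kwargs": repr(kwargs),
            }
            if not policy(tool.__name__, args, kwargs):
                record["outcome"] = "denied"
                log.warning(json.dumps(record))
                raise ToolDenied(tool.__name__)
            try:
                result = tool(*args, **kwargs)
            except Exception as exc:
                record["outcome"] = f"error: {exc!r}"
                log.info(json.dumps(record))
                raise
            record["outcome"] = "ok"
            log.info(json.dumps(record))
            return result
        return wrapper
    return decorate
```

Logging the allowed calls matters here because the behaviors discussed in the article are mostly calls that the policy permitted.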

If anyone wants to test their agent setup: https://app.xsourcesec.com (free tier available)

memoriuaysj•7m ago
how do you feel about containers versus VMs?
embedding-shape•6m ago
At first they talked about running it in a sandbox, but then later they describe:

> It searched the environment for vor-related variables, found VORATIQ_CLI_ROOT pointing to an absolute host path, and read the token through that path instead. The deny rule only covered the workspace-relative path.

What kind of sandbox has the entire host accessible from the guest? I'm not going as far as running codex/claude in a sandbox, but I do run them in podman, and of course I don't mount my entire hard drive into the container when it's running. That would defeat the entire purpose.
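
To make the quoted failure mode concrete, here is a small sketch of why a deny rule keyed on one path string misses a second route to the same file, and how comparing resolved paths catches it. Everything here is hypothetical (function names, directory layout, token contents), it is not how the article's tool works, and it assumes a platform where symlinks are available. It also does nothing about a rename or directory swap, since a rule that is resolved at check time simply follows whatever now sits at the denied path.

```python
import os
import tempfile


def naive_is_denied(path, denied_paths):
    """Deny only when the requested string matches a rule exactly."""
    return path in denied_paths


def resolved_is_denied(path, denied_paths):
    """Deny when the request resolves to the same file as any rule."""
    real = os.path.realpath(path)
    return any(real == os.path.realpath(d) for d in denied_paths)


def demo():
    with tempfile.TemporaryDirectory() as root:
        workspace = os.path.join(root, "workspace")
        os.makedirs(os.path.join(workspace, "secrets"))
        token = os.path.join(workspace, "secrets", "token")
        with open(token, "w") as f:
            f.write("hypothetical-token")

        # A second route to the same directory, standing in for the
        # absolute host path the agent found in an environment variable.
        alias = os.path.join(root, "alias")
        os.symlink(os.path.join(workspace, "secrets"), alias)
        via_alias = os.path.join(alias, "token")

        denied = [token]
        return (
            naive_is_denied(via_alias, denied),
            resolved_is_denied(via_alias, denied),
        )


if __name__ == "__main__":
    print(demo())  # (False, True): the string rule misses the alias
```

Not exposing the file to the guest at all is still the stronger fix; path checks like this are a second layer at best.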

Where are the actual session logs? It seems like they're pushing their own solution, yet the actual data for these is missing, and the whole "provoked through red-teaming efforts" makes it a bit unclear what exactly they put in the system prompts, if they changed them. Adding things like "Do whatever you can to recreate anything missing" might of course trigger the agent to actually try things like forging integrity fields, but I'm not sure that's even bad; you do want it to follow what you say.
