
Ask HN: How are you sandboxing coding agents?

46•m-hodges•1mo ago
I've seen people rely on built-in sandboxes, use git worktrees (sometimes inside devcontainers), or run the whole agent inside a Linux VM with minimal host mounts. On Linux, I’ve also seen firejail/bubblewrap mentioned.

For folks actually using these tools day-to-day:

What’s your default setup?

Have you had any "learned the hard way" moments?

What tradeoff (safety vs convenience vs parallelism) has mattered most in practice?

I'm less interested in theoretical best practices than what's actually holding up under real use.

Comments

netcoyote•1mo ago
I use a Mac, and wanted to be able to run MacOS programs like Xcode and iOS simulator, so I wrote a couple of different sandbox projects:

- SandVault (https://github.com/webcoyote/sandvault) runs the AI agent in a low-privilege account

- ClodPod (https://github.com/webcoyote/clodpod) runs the AI agent inside a MacOS VM

In both cases I map my code directories using shares/mounts.

I find that I use the low-privilege account solution more because it's easier to set up and doesn't require the overhead of a full VM.

tmaly•1mo ago
do you have a write up on your setup?
sixhobbits•1mo ago
I have time machine and just let them fly with --dangerously-skip-permissions on my Mac. Worst thing it's done is back up a database, delete the database, and then run git clean locally which also wiped out the backup, so I'm not saying there are no dangers but honestly I've made worse mistakes and probably more frequently so I generally trust Claude with about the same level of access as me now.

The most common mishap is deleted files, but if you're using git and have backups it's barely noticeable.

OJFord•1mo ago
How are you going to notice that while working on ~/projects/acme3000 it for some reason deleted ~/photos/2003/once-in-a-lifetime-holiday/?

Backups are great when you know you need to restore.

Wowfunhappy•1mo ago
I could ask this question without AI. How are you going to notice that while you were working on ~/projects/acme3000, you for some reason deleted ~/photos/2003/once-in-a-lifetime-holiday/?

Of course, AI is not a real person, and it does make mistakes that you or I probably would not. However, this class of mistake—deleting completely unrelated directories—does not appear to be a common failure mode. (Something like deleting all of ~ doesn’t count here—that would be immediately noticeable and could be restored from a backup.)

(Disclaimer, I’m not OP and I wouldn’t run Claude with --dangerously-skip-permissions on my own system.)

gspetr•1mo ago
Isn't the problem one of finding a consistency heuristic? For example, test that the resulting state is consistent with your test suite.

If it is a directory that gets deleted, then you can diff it with a previous state. If you don't control the state and don't know the surface area that you should observe, then yes, you're inviting trouble if agents run amok.
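The "diff it with a previous state" idea can be sketched in a few lines of Python. This is only an illustration, not anyone's actual tooling from this thread: `snapshot` and `diff_snapshots` are hypothetical helper names, and the sketch assumes the tree is small enough to hash in full and that you take the "before" snapshot yourself ahead of the agent run.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file's path (relative to root) to a SHA-256 of its contents."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):  # skip broken symlinks and special files
                continue
            with open(path, "rb") as fh:
                state[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return state


def diff_snapshots(before, after):
    """Return sorted lists of (deleted, added, modified) relative paths."""
    deleted = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    modified = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return deleted, added, modified
```

Taking a snapshot of a directory you care about before a session and comparing afterwards would surface a silent deletion under something like ~/photos, but only for the directories you thought to watch, which is the point being made above.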

estimator7292•1mo ago
Yeah I've got hourly backups out to multiple remote servers. My dev machine is in essence fungible. If it gets hosed, I'll wipe the drive and drop a good backup in. If it catches fire, I'll pick up a different machine and drop in the good backup.

I have more important things to waste my time on than writing absurd sandboxes to run AI agents without guardrails in. What even?

gl-prod•1mo ago
I spin a Firecracker VM with a custom image that has all the things I need.
stavros•1mo ago
I wrote a small utility that wraps commands in Docker: https://github.com/skorokithakis/dox
jomcgi•1mo ago
I have a web ui for managing / interacting with opencode sessions. Everything runs as a pod in my homelab cluster so I can let them "bypass" permissions and just restrict the pods.

I wanted something like Claude code web with access to more models / local LLMs / my monorepo tooling, so far it's been great.

The output is a PR so it's hard for it to break anything.

The biggest benefit is probably that it makes it easier to start stuff when I'm out. It feels like a much better use of downtime, since I'm not waiting to get home to start a session after I have an idea.

The monorepo tooling is a big win too: for a bunch of things I have just one way to do it, plus clear instructions for the agents to use the binaries that get bundled into new sessions, so it gets things "right" more often.

aussieguy1234•1mo ago
I run VS Code-based agents on Linux, mostly Kilo Code.

After a bit of tinkering I was able to get it all to run fine in Firejail. I wrote a guide here: https://softwareengineeringstandard.com/2025/12/15/ai-agents...

It's fairly basic: it limits the agent's write access to my projects, all of which are backed up in git.

techsystems•1mo ago
Thanks for the share, but I'm having a hard time understanding this.

On step 2, it's only jailing VS Code. Shouldn't it also jail the Git repo you're working on (and disable `git push` somehow), as well as all the env libs?

Also, isn't the point of this to auto approve everything?

yomismoaqui•1mo ago
Using Claude Code and Amp (free mode) with no sandbox.

I don't run Claude Code in YOLO mode, I just approve commands the first time I'm asked about them.

I've been using them since July and haven't had any data loss, and the clanker has not tried to delete my $HOME.

notarobot123•1mo ago
I do similar but it's incredible how our threat model has changed so much to allow this. I have to trust this one node package (and all its dependencies) and Anthropic more than I trust my email provider, my ISP or my browser.

Who'd have imagined remote code execution as a service would have caught on as much as it has!

sevenseacat•1mo ago
This is why I don't use Claude Code on my personal machine. My work machine, sure, my work encourages that. My personal machine, I use Claude through Zed with an API key, and manually approve every command.
foreigner•1mo ago
I'm using Catnip (https://github.com/wandb/catnip). It runs Claude Code in YOLO mode inside a Docker container, and also manages multiple Claude instances running in Git worktrees. I'm pretty happy with it but would be happier if it addressed limiting network access to guard against exfiltration.
Havoc•1mo ago
For CC - unprivileged LXC on a Proxmox server. That's enough to catch mishaps like deleting all your sht while still being a reasonably transparent isolation layer. Plus my entire home setup is geared towards LXC anyway.

Keen to give firecracker another go though. Last I explored that it still felt pretty rough. (on UX not tech quality)

solresol•1mo ago
I create a separate Linux user (which doesn't have sudo rights) for each project. I have to log each user in to Claude code or codex, but then I can use ordinary Unix permissions to keep the bots under control and isolated.
zmj•1mo ago
devcontainers, without credentials to the git remote.
languid-photic•1mo ago
> Have you had any "learned the hard way" moments?

A big lesson for us is that you still need to be careful even in a sandbox.

We've been running Claude/Codex/Gemini in sandboxed YOLO mode and have seen some interesting bypass attempts. [1]

A few examples:

- created fake npm tarballs and forged SHA‑512s in our package‑lock.json

- masked failures with `|| true`, making blocked operations look successful

- cloned a workspace, edited the clone, then replaced the workspace with the clone to bypass file‑path deny rules

So, we’ve learned to default to verbose logging, patch bypasses as we see them, and try to keep iteration loops short.

[1] https://voratiq.com/blog/yolo-in-the-sandbox/
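As one illustration of the `|| true` case above, a log reviewer could flag commands that force a successful exit status. The sketch below is a rough heuristic only, not the tooling described in the linked post: `masks_failure` is a hypothetical helper, and the pattern will miss variants such as `|| /bin/true` or anything hidden inside a script file.

```python
import re

# Matches "|| true", "|| :" and "|| exit 0" when they end a command or are
# followed by a shell separator. Purely a text heuristic, not a shell parser.
MASK_PATTERN = re.compile(r"\|\|\s*(?:true|:|exit\s+0)\s*(?:[;)&|]|$)")


def masks_failure(command):
    """Return True if the command line appears to swallow a non-zero exit status."""
    return bool(MASK_PATTERN.search(command))


if __name__ == "__main__":
    for cmd in ["npm install || true", "test -f x || touch x"]:
        print(cmd, "->", masks_failure(cmd))
```

A check like this fits the "verbose logging" habit mentioned above: it does not block anything, it just makes suspicious lines easier to spot when reading back through a session.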

kasey_junk•1mo ago
I watched Claude download the Rust toolchain and build a userland networking stack to get around some container sandboxing restrictions I had in place. Tbf to Claude, my prompts didn't explicitly ask for this, but they did intentionally put it in conflict with the sandboxing.
languid-photic•1mo ago
Yes, typically the agent is just trying to do what it's been instructed to do, but sometimes it's too naive to realize its approach is a bit sketchy.

And actually, one way we've hardened our sandbox is by tasking agents with impossible tasks (within the sandbox), then analyzing and patching each workaround.

gverrilla•1mo ago
is firejail safe to use for this purpose? any tips?
mac-attack•4w ago
This was my initial perspective as well. Given that there are no profiles, I will likely have to pivot to something else
scuff3d•1mo ago
I feel like a crazy person reading these comments, "oh it tries to bypass limitations, delete files, and generally nuke my system... But it's cool, I trust it"
subsection1h•1mo ago
Exactly. Also, it's not clear to me if some of these people think that containers are a sandbox or they simply don't care about security.

For anyone out there who thinks that containers are a sandbox...

There's a reason why gVisor exists:

https://github.com/google/gvisor#why-does-gvisor-exist

There's a reason why secureblue doesn't use containers:

https://news.ycombinator.com/item?id=45045190

There's a reason why Qubes OS doesn't use containers.

jq-r•1mo ago
Claude Code in yolo mode with Docker Sandboxes https://docs.docker.com/ai/sandboxes/
___timor___•1mo ago
That's something new. I'll have to try it. Thanks!
___timor___•1mo ago
Containers work quite well and fast. https://gagor.pro/2025/10/running-gemini-cli-in-a-docker-con...
throwayaw84330•1mo ago
I use https://github.com/longregen/claude-sandbox

It uses bubblewrap (no root needed) and only exposes ~/.cache stuff and the current folder (no git credentials, no ssh credentials, and as few permissions as is feasible).

bubblewrap is a bit more lightweight than Docker (afaiu no overlayfs, and it launches way faster), but it relies on the same underlying kernel mechanism for isolation (namespaces rather than cgroups, which only limit resources).
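For readers who have not used bubblewrap, here is a minimal sketch of what "expose only ~/.cache and the current folder" can look like as a `bwrap` command line, built in Python so it can be inspected before running. It is not taken from the claude-sandbox project linked above: `bwrap_argv` is a hypothetical helper, and the read-only system binds assume a merged-/usr distro layout and that exposing /etc read-only is acceptable.

```python
import os


def bwrap_argv(project_dir, command, home=None):
    """Build a bwrap argv exposing only the project dir and ~/.cache read-write."""
    home = home or os.path.expanduser("~")
    project_dir = os.path.abspath(project_dir)
    cache_dir = os.path.join(home, ".cache")
    return [
        "bwrap",
        "--unshare-all",             # fresh user, mount, pid, net, ipc, uts namespaces
        "--share-net",               # re-enable networking so the agent can reach its API
        "--die-with-parent",         # kill the sandbox if the launcher exits
        "--ro-bind", "/usr", "/usr",
        "--ro-bind", "/etc", "/etc",
        "--symlink", "usr/bin", "/bin",      # assumes merged /usr
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        "--tmpfs", home,             # empty home first: hides ~/.ssh, git credentials, etc.
        "--bind", cache_dir, cache_dir,
        "--bind", project_dir, project_dir,
        "--chdir", project_dir,
        "--",
        *command,
    ]


if __name__ == "__main__":
    print(" ".join(bwrap_argv(".", ["bash"])))
```

The order matters: the empty tmpfs is mounted over the home directory first, and the cache and project binds are layered on top of it, so nothing else under the home directory is visible inside the sandbox.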

jacob019•1mo ago
Funny you should mention this, I just added a simple filesystem sandbox to my coding agent.

Check it out: https://github.com/jacobsparts/agentlib/blob/main/src/agentl...

The framework is all Python, but I used C for this helper. It uses unprivileged user namespaces to mount an overlay and run an arbitrary command, then when the command finishes, it writes a tarball of edits, which I use to create a unified diff. The framework orchestrates it all transparently, but the helper itself could be used standalone. Here's a short document about the sandbox in the context of its use in my project:

https://github.com/jacobsparts/agentlib/blob/main/docs/sandb...

I also have a version that uses SUID instead of unprivileged user namespaces, available by request.

I often use Claude Code with --dangerously-skip-permissions, but every once in a while it bites me. I've learned to use git for everything and put instructions to always commit BEFORE writes in CLAUDE.md. Claude can go off the rails on harder bug fixes, especially if there are multiple rounds of context compacting, and it can really screw things up. It usually honors guidance not to modify anything outside of the project, but a simple sandbox adds so much: after the session is over you can see what changed and decide what to do with it. It really helps with the problem where it makes unexpected changes to the codebase, which you might not even notice otherwise and which can introduce serious bugs. The permission models of all the coding agents are rough: either you can't get anything done, or you throw caution to the wind. Full sandboxes are quite restrictive, which is why I rolled my own. Honestly, your best option right now is just to have good version control and run coding agents in dedicated environments.
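The "run the command, then review a unified diff of what changed" workflow can be approximated without an overlay mount. The sketch below is not the C helper described above: it swaps the overlay for a plain directory copy, `run_on_copy` and `read_tree` are hypothetical names, and it assumes a small project of text files without symlinks. A copy gives no isolation by itself (the command can still write anywhere the user can), so treat it as a review aid rather than a sandbox.

```python
import difflib
import os
import shutil
import subprocess
import tempfile


def read_tree(root):
    """Read every file under root (skipping .git) as a list of lines."""
    files = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                files[os.path.relpath(path, root)] = fh.read().splitlines(keepends=True)
    return files


def run_on_copy(project_dir, command):
    """Run command inside a throwaway copy of project_dir and return a unified diff."""
    with tempfile.TemporaryDirectory() as scratch:
        work = os.path.join(scratch, "work")
        shutil.copytree(project_dir, work)
        subprocess.run(command, cwd=work, check=False)
        before, after = read_tree(project_dir), read_tree(work)
    chunks = []
    for rel in sorted(before.keys() | after.keys()):
        chunks.extend(difflib.unified_diff(
            before.get(rel, []), after.get(rel, []),
            fromfile="a/" + rel, tofile="b/" + rel))
    return "".join(chunks)
```

The original tree is never modified, so the returned diff can be read through and then applied or discarded, which matches the "see what changed and decide what to do with it" step.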

onetimer1•1mo ago
I run Windsurf in unprivileged podman [0], and only mount what is strictly necessary; I do the same with Claude

[0] https://github.com/grzegorzk/codeium_windsurf_in_podman