Show HN: Boxes.dev: ditch localhost; run Claude Code and Codex in the cloud

https://boxes.dev
44•nab•2h ago
Hi HN, we’re Nick and Drew, and we’re building boxes.dev – the first cloud-only agentic dev environment (ADE) that gives every Codex and Claude Code agent its own cloud computer.

We’re two engineers who previously built Gem (co-founder/CTO and first hire), and we spent the last year coding almost exclusively using Codex and Claude Code. It’s been a huge change to how we code, and it’s been exhilarating seeing the models keep getting better – but we eventually realized that developing on localhost was holding us back:

- Git worktrees are clunky to set up and use for parallelizing work.
- It’s 2026, but somehow everyone is still walking around with laptops cracked open or SSHing into mac minis in their garage so their agents don’t stop working.
- Mobile is treated like an afterthought even though coding is just texting now.
- We started hitting resource constraints when multiple parallel agents test their own work by running the full app locally.

We tried different products, but couldn’t find any that solved all of our pain points – so we pivoted and decided to just build the ADE we wanted for ourselves.

Boxes.dev is a desktop and mobile app that lets you run Claude Code, Codex (using your subscription!), and the full dev environment for whatever you’re building, all on remote compute. It’s similar to Conductor or the Codex desktop app, except everything is in the cloud.

We use coding agents to scan your local dev setup and port it to the cloud. Then every Claude Code/Codex thread starts from a snapshot of the full setup, with its own filesystem and compute. No more git worktrees, no more cracked-open laptops, and your coding agents can actually test their work end-to-end because they can run your full app in isolation.

We’ve mirrored the Claude Code and Codex UX to feel natural to power users, and also have a fully-featured mobile app (no handoffs or remote control), plus scheduled automations and a Slack integration.

We’re obviously biased, but we’ve been building boxes.dev with boxes.dev for months and it’s honestly been a gamechanger. It’s hard to go back once you realize how much localhost has been limiting you; based on early feedback from beta testers, we’re increasingly sure that cloud is the future of agentic coding.

We’d love for you to experience it yourselves! Would appreciate any feedback – and happy to answer any questions on this thread.

Comments

iloveluce•1h ago
Interesting. Given that OpenAI and Anthropic are steadily moving down the stack (e.g. remote execution, Codex desktop, Claude Code integrations), how do you think about defensibility? Do you expect the labs to eventually offer a cloud-native ADE themselves, and if so, what advantage do you think an independent platform retains?

Also, do you see Boxes supporting OpenCode and self-hosted/local models in the future? If the rented machines have enough RAM and GPU access, it seems like there could be an interesting path toward a model-agnostic platform rather than being tied to the frontier labs.

nab•53m ago
A few angles to this. One is that coding just went through a massive change over the past year, one that is not yet fully settled. Remember when everyone insisted on using IDEs and seeing the code with a chat sidebar? It's hard to argue you'll still be reading code a year from now. And even today, most people are still developing locally, which we're betting will shift to the cloud over the next few years.

I imagine other players will build cloud support into their own apps, but even now there's a lot of distraction for them. Everyone is still trying to support local execution, which looks really different from cloud. A lot of the labs are taking their coding-focused teams and throwing non-coding work on their plates as well (the same app for non-engineers slinging Google Sheets).

We think getting the cloud experience right for software engineers (as well as companies, with their own hosting/development needs) is going to be really hard, and the problem needs a team fully focused on that. We also think that companies are rightly nervous about putting all their eggs in one basket -- their long term development environment should be harness and model agnostic.

RE OpenCode + self-hosted/local models: definitely. There's nothing holding us back from supporting these, since the boxes are just Linux machines. But we wanted to start with the most popular harnesses first and go from there.

indigodaddy•1h ago
I might use this if it supported any old cloud or VPS, and was at most $10/mo. The fact that you have decided that this platform should only live in your own custom cloud is unappealing to me.

Or, open source it and let us run it on our own VPS, and keep your expensive cloud for those who want to pay. As it stands, I would never consider it.

nab•35m ago
Thanks a ton for the feedback. Yeah, this is something we'll try to solve in the long term. One of the things that makes this work really smoothly for setup and speed is the ability to have a template box that you can instantly snapshot and fork (disk and RAM) to spin up new machines. There aren't many sandbox providers that do that well for running a full app and development environment, but I'm sure there will be more over time. And the per-second pricing means that you only pay when your agent is running.

You could use a VPS, but spinning boxes up and down on inactivity takes a long time, and making changes to the template for new machines is less trivial there. If you're only paying for 1 VPS box, then you lose the "multiple independent machines" benefit, and I imagine things start to get more expensive even in the VPS world when you have 10 of them running at the same time (one per thread).
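
To make the template/snapshot/fork idea in this reply concrete, here is a toy sketch. It is an illustration only, not how Boxes.dev is implemented: the `Box` and `Snapshot` classes are hypothetical, the "disk" and "RAM" are plain dictionaries, and forks are made with deep copies, whereas a real system would presumably use copy-on-write snapshots so that forking is near-instant.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical frozen image of a box's disk and RAM."""
    files: dict
    memory: dict

    def fork(self) -> "Box":
        # Each fork gets its own private copy, so threads cannot affect each other.
        return Box(copy.deepcopy(self.files), copy.deepcopy(self.memory))


@dataclass
class Box:
    """Hypothetical stand-in for one machine: a filesystem dict plus in-memory state."""
    files: dict
    memory: dict

    def snapshot(self) -> Snapshot:
        # Capture disk and RAM together, as the reply describes.
        return Snapshot(copy.deepcopy(self.files), copy.deepcopy(self.memory))


# Set up the template once, then start every thread from the same snapshot.
template = Box(files={"app/main.py": "print('v1')"}, memory={"dev_server": "running"})
snap = template.snapshot()

thread_a = snap.fork()
thread_b = snap.fork()

# A change made in one thread's box is invisible to the other and to the template.
thread_a.files["app/feature.py"] = "print('new feature')"
print("app/feature.py" in thread_b.files)  # False
print(thread_b.memory["dev_server"])       # running
```

The point of the sketch is the isolation property: every fork starts with the same files and running state, and diverges independently from there.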

indigodaddy•15m ago
Pretty sure you could accomplish this in a large physical server or even a huge resource VM (that has KVM passthrough) with some sort of microvm technology? Then that would obviate the need for "multiple cloud instance per coding thread", it would just be a microvm on the large server.

Then again, I'm just the guy running his mouth, and you guys are the ones actually doing the work :)

BTW, looks very polished and thought-through, I may have to still give it a try!

cohix•1h ago
I really like the pricing model and focus on not shafting people by auto-sleeping when an agent is done working.

I’ve been working on an [OSS TUI](https://github.com/prettysmartdev/awman) for managing agent execution and workflows in containers (local or remotely) and would love to collaborate if you’re interested.

__natty__•58m ago
Maybe I’m naive but the longest single workflow I ran was maybe 15 minutes. How do you steer agents to run “overnight”? And what is the quality of such execution?

notrealyme123•56m ago
Usually coding tasks where the closed-loop evaluation takes time.

E.g. code debugging

nab•49m ago
This. Very few people are doing this right now (probably because it sucks having 5 copies of your app running in parallel on your laptop), but in the past few months models have gotten really good at testing your running app live. If you have an environment where you can run your full app and models can get at it via Playwright and Chromium, they can click around, take actions, and actually verify that their code works.

With boxes.dev I've started pushing agents harder to run the full app and test their work end to end, and send me screenshots as proof. This takes time, sometimes up to 30-40 minutes, but the result is much more likely to be bug-free at the end of the day.

Bnjoroge•51m ago
What are “box-hours”? Regular hours just running in boxes? Do I get charged the same when 1) the agent is doing some external thing, say a web search that takes a while, and 2) when the agent isn't running (say, waiting for my input)?

dregitsky•28m ago
It's just one hour of runtime. But we put the machines to sleep very quickly once the agent finishes its work, and then wake them when you interact in the UI (e.g. terminal, filesystem, send the agent a followup). We're running on Firecracker microVMs, so we can sleep/wake very quickly, which keeps things nice and responsive.

Re: web searches -- we're running a full Linux kernel and the agent runs on the machine itself, so we can't sleep mid-run. Conceptually, moving the agent off-box and sleeping during web searches etc. would be interesting, but in our experience coding agents are running enough stuff on the machine itself (rg, bash, playwright, etc.) that there wouldn't be much savings.
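
As a rough illustration of the accounting described in this reply (a sketch under assumptions, not Boxes.dev's actual billing code): if a box-hour is one hour of awake runtime metered per second, usage can be summed from the intervals during which the box was awake. The `AwakeInterval` record and the `box_hours` function are hypothetical names, and the numbers are made up for the example.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600


@dataclass(frozen=True)
class AwakeInterval:
    """Hypothetical record of one stretch during which a box was awake."""
    start_s: int  # seconds since the thread was created
    end_s: int


def box_hours(intervals: list[AwakeInterval]) -> float:
    """Sum awake seconds across all intervals and convert them to box-hours."""
    total_seconds = 0
    for interval in intervals:
        if interval.end_s < interval.start_s:
            raise ValueError("interval ends before it starts")
        total_seconds += interval.end_s - interval.start_s
    return total_seconds / SECONDS_PER_HOUR


# The agent works for 40 minutes (including any web searches, since the box
# stays awake mid-run), the box sleeps while waiting for input, then a
# follow-up wakes it for another 5 minutes.
usage = [AwakeInterval(0, 2400), AwakeInterval(9000, 9300)]
print(box_hours(usage))  # 0.75
```

Under this model the gap between the two intervals (the box asleep, waiting for input) contributes nothing, while a slow web search inside a run is metered like any other awake time.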

pavelpilyak•50m ago
How does this handle MCP credentials - both for stdio servers that read tokens from local config, and for HTTP ones where the harness holds an OAuth token? Either way, do those secrets end up in your cloud? Curious what the security model is.

nab•41m ago
Right now the way you'd do this is you'd select the "Main box" or template VM in the UI, pull up a terminal tab, and authenticate whatever MCPs you care about. These are stored however the MCP is storing them (likely filesystem) on the VM. When you're done, you can "snapshot" the template VM and all future forks/new threads will start from that snapshot of filesystem + RAM.

We recommend you auth with only development credentials (or use something like two-factor confirmation if there are more sensitive actions you want to approve before the agent gets access), but it's still early for us and we're continuing to refine this as we go. For companies, we're down to brainstorm how they'd ideally like this to work for them. And over the long term we'll support hosting this in your own cloud.

Curious if you have a take on how you'd like this to work from a UX standpoint.

servercobra•45m ago
Nice, this looks exactly like what I've been looking for. I tried Fly.io Sprites and it _almost_ got me there, but I got annoyed logging into my CC for every new feature. Unfortunately I wound up going all in on Cursor Cloud Agents, which overall has been decent.
