Web app submits the prompt, a sandbox starts on sprites.dev and any Claude output in the sandbox gets piped to the web app for display.
Not sure I can open source it as it's something I built for a client, but ask if you have any questions.
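For anyone curious about the shape of that flow, here is a rough sketch of just the "pipe the output back" part. It is not the client code and it does not use the sprites.dev API: a local subprocess stands in for the sandbox, and `stream_command` and `on_line` are made-up names for illustration.

```python
import subprocess
import sys
from typing import Callable, List


def stream_command(cmd: List[str], on_line: Callable[[str], None]) -> int:
    """Run cmd and forward each line of its output to on_line as it arrives.

    Assumption: a local subprocess stands in for the remote sandbox here.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the viewer sees everything
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        on_line(line.rstrip("\n"))
    return proc.wait()


if __name__ == "__main__":
    # Hypothetical stand-in for the agent process running in the sandbox.
    fake_agent = [sys.executable, "-c", "print('step 1'); print('step 2')"]
    exit_code = stream_command(fake_agent, lambda line: print("to web app:", line))
    print("agent exited with", exit_code)
```

In a real setup `on_line` would presumably push to a websocket or server-sent-events stream instead of printing.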
But. This tool is scarily good. I’m seeing it “1-shot” features and fixes in a fairly sizable code base, with better code and accuracy than I manage.
Create the right agentic feedback loop and a reasoning model can perform far better through iteration than its first 1-shot attempt.
This is very human. How much code can you reliably write without any feedback? Very little. We iterate, guided by feedback (compiler, linter, executing and exploring).
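A toy sketch of that loop, to make the idea concrete. Everything here is an assumption for illustration: `toy_propose` stands in for the model (it is a hard-coded stub, not a real API call), and the only "feedback" is Python's own `compile()` acting as the compiler step.

```python
from typing import Callable, Optional, Tuple


def iterate_with_feedback(
    propose: Callable[[Optional[str]], str],
    check: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 5,
) -> Tuple[str, int, bool]:
    """Ask for an attempt, check it, and feed the errors back until it passes."""
    feedback: Optional[str] = None
    attempt = ""
    for round_no in range(1, max_rounds + 1):
        attempt = propose(feedback)
        ok, feedback = check(attempt)
        if ok:
            return attempt, round_no, True
    return attempt, max_rounds, False


def syntax_check(source: str) -> Tuple[bool, str]:
    """Feedback source: does the attempt compile as Python?"""
    try:
        compile(source, "<attempt>", "exec")
        return True, ""
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"


def toy_propose(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for the model: broken first, fixed once it sees feedback."""
    if feedback is None:
        return "def add(a, b) return a + b"  # missing colon
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    attempt, rounds, ok = iterate_with_feedback(toy_propose, syntax_check)
    print(f"passed={ok} after {rounds} round(s)")
```

Swap `syntax_check` for a linter or a test run and the structure stays the same; the first attempt matters less than how good the feedback is.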
But honestly? This here really is something.
I can vividly imagine how, in the not-too-distant future, there will only be two types of product companies: those that work like this, and those that don’t and vanish.
Edit: To provide a less breathless take myself:
What I can very realistically imagine is that, just as sane and level-headed startups today go "let’s first set up some decent infrastructure-as-code, a continuous delivery pipeline, and a solid testing framework, and then start building the product for good", in the future sane and level-headed startups will go "let’s first set up some decent infrastructure-as-code, a continuous delivery pipeline, a solid testing framework, and a Ramp-style background agent, and then start building the product for good".
If it really does work I expect there will be many paid and open source variants that other companies can adopt into their workflows. So I'll patiently wait for the outcomes before trying something like this, but I'm glad someone is.
Surprised they need both.
I wonder if we're at the point where building and maintaining this yourselves (assisted by an AI copilot) is now more cost-effective than buying an off-the-shelf product?
It feels like there's a LOT of moving parts here, but also it's deeply tailored to their own setup.
FWIW, I tried pointing Claude at the post and asking it to design an implementation (like the post said to do), and it struggled, but perhaps I prompted it wrong.
If you need a queue, lpd. If you need scheduling, cron. If you need backups, tar. If you need to communicate, email and irc. If you need to remote any of those, ssh.
Things shouldn't be difficult, yet they are.
Also, inevitably these AI companies will start selling out data and become part of the surveillance state, if they're not already.
Claude Code running locally in a VM and/or with worktrees will 1-shot far better without burning cloud infra cash.
I’d bet this ends up wasting more money and time than it’s worth in practice.
A day of work to get the prototype working and a few hours the next day to allow multiple users to authenticate.
It's surprisingly simple.