You played yaself
At our company we just run agents inside isolated containers with isolated network access, so an agent cannot even SSH anywhere or fuck anything up, even if it gets access to something it shouldn't... That's the only safe way: inconvenient, true, but the only safe option.
PS: At the same time, I've observed that this way people actually use the agent more reasonably, e.g. producing helper scripts for their daily stuff, producing very specific things, creating simple PoCs, but they don't commit to vibe-coding all the functionality in their corresponding software products.
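For anyone wondering what that kind of isolation could look like, here is a rough sketch, not our actual setup. It only builds a `docker run` command line with networking turned off and privileges dropped. The image name `agent-sandbox:latest`, the `agent.py` entry point, and the helper `sandboxed_agent_command` are all made up for illustration, and `--network none` is the strictest choice; a real setup with "isolated network access" might allow a narrow allowlist instead.

```python
import shlex


def sandboxed_agent_command(image, agent_cmd, workdir="/work"):
    """Build a `docker run` argv that starts an agent with no network access.

    `image` and `agent_cmd` are hypothetical placeholders, not a real product.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network at all, so no SSH out
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--read-only",                          # root filesystem is read-only
        "--tmpfs", f"{workdir}:rw,size=256m",   # scratch space that vanishes on exit
        "--workdir", workdir,
        image,
        *agent_cmd,
    ]


if __name__ == "__main__":
    argv = sandboxed_agent_command("agent-sandbox:latest", ["python", "agent.py"])
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(argv))
```

The point of building the argv as a list is that it can be passed straight to `subprocess.run(argv)` without shell quoting issues once you trust it.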
Are they actually different? I would guess they have roughly the same efficacy. 100% code coverage means nothing, and this is especially true with LLMs.
Now, should I mention all the screw-ups I have seen at several SaaS companies with 1B+ valuations, including DocuSign and more security-oriented ones (PIM-related, etc.)?
For any software, you need a minimum of critical mindset and experience, which you don't usually see.
Edit: To combat this we need to actually write and understand our code.
_pdp_•46m ago