However, this design is still under development, as it creates quite a few challenges.
Every time a project that uses WASM gets shared, the same questions come up: we need to know whether an email being sent by an agent is actually supposed to be sent, whether an agent is actually supposed to be making that transaction on my behalf, and so on.
Sandboxes could provide that level of observability; however, it's a heavy lift. Yet I don't have better ideas either. Do you?
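To make the observability point concrete, here is a rough sketch of the kind of choke point I mean: every tool call the agent makes gets recorded before it executes, so there is at least a trail of intent to inspect. The tool names and log format are invented for illustration.

    import json, time

    AUDIT_LOG = "agent_actions.jsonl"

    def audited(tool_name, tool_fn):
        # Wrap a tool so each call is recorded before it actually runs.
        def wrapper(**kwargs):
            record = {"ts": time.time(), "tool": tool_name, "args": kwargs}
            with open(AUDIT_LOG, "a") as f:
                f.write(json.dumps(record, default=str) + "\n")
            return tool_fn(**kwargs)
        return wrapper

    # send_email = audited("send_email", send_email)  # hand only wrapped tools to the agent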
Solutions? No. For now it's continued cat and mouse, with things like "good agents" in the mix (i.e. AI as a judge, which of course is just as exploitable through prompt injection), and deterministic policy where you can (e.g. OPA/Rego).
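For illustration, a toy deterministic policy in Python; the real thing would be Rego against your own action schema, and the tools, fields, and limits here are all invented:

    ALLOWED_EMAIL_DOMAINS = {"example.com"}   # invented example rule
    MAX_TRANSACTION_USD = 100                 # invented example rule

    def allow(action: dict) -> bool:
        # Deterministic allow/deny with no model in the loop; default deny.
        if action["tool"] == "send_email":
            return action["args"]["to"].split("@")[-1] in ALLOWED_EMAIL_DOMAINS
        if action["tool"] == "make_transaction":
            return action["args"]["amount_usd"] <= MAX_TRANSACTION_USD
        return False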
We should continue to enable better integrations with the runtime, which is why I created the original feature request for hooks in Claude Code.
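Roughly what one of those hooks can look like: a PreToolUse hook receives the pending tool call as JSON on stdin, and a nonzero exit blocks it. The field names and exit-code convention below are from my reading of the docs, so double-check against the current ones before relying on this.

    #!/usr/bin/env python3
    import json, sys

    event = json.load(sys.stdin)              # pending tool call arrives as JSON on stdin
    tool = event.get("tool_name", "")
    command = event.get("tool_input", {}).get("command", "")

    # Toy heuristic: refuse Bash calls that reach for the network.
    if tool == "Bash" and ("curl" in command or "wget" in command):
        print("blocked: network fetch from a Bash tool call", file=sys.stderr)
        sys.exit(2)                           # exit code 2 blocks the call; stderr goes back to the model

    sys.exit(0)                               # everything else goes through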
1. An LLM given untrusted input produces untrusted output, and should only be able to generate output for human review or output that's verifiably safe (see the sketch after this list).
2. Even an LLM without malicious input will occasionally do something insane and needs guardrails.
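A minimal sketch of rule 1, with invented tool names: anything known to be side-effect free runs directly, everything else waits for a human.

    SAFE_TOOLS = {"read_file", "search_docs"}   # invented read-only tools, no side effects

    def execute(action: dict, run_tool):
        if action["tool"] in SAFE_TOOLS:
            return run_tool(action)             # verifiably safe: just run it
        prompt = f"Agent wants {action['tool']} with {action['args']}. Allow? [y/N] "
        if input(prompt).strip().lower() == "y":
            return run_tool(action)
        return "held for human review"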
There's a gnarly orchestration problem I don't see anyone working on yet.