Hey HN. I'm Luke, a security engineer and the creator of Sigstore and other open source security projects. I've been building nono, an open source sandbox for AI coding agents that uses kernel-level enforcement (Landlock/Seatbelt) to restrict what agents can do on your machine.
One thing that's been bugging me: we give agents our API keys as environment variables, and a single prompt injection can exfiltrate them via env, /proc/PID/environ, or just an outbound HTTP call. The blast radius is the full scope of that key.
So we built what we're calling the "phantom token pattern" — a credential injection proxy that sits outside the sandbox. The agent never sees real credentials. It gets a per-session token that only works with the session-bound localhost proxy. The proxy validates the token (constant-time), strips it, injects the real credential, and forwards upstream over TLS. If the agent is fully compromised, there's nothing worth stealing.
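For a feel of the core swap step, here's a minimal sketch (not nono's actual code — the function name and token format are made up for illustration): the proxy compares the presented phantom token against the session token in constant time, then rewrites the header with the real key before forwarding.

```python
import hmac

def swap_credential(auth_header: str, session_token: str, real_key: str) -> str:
    """Validate the phantom token in constant time, then replace it.

    Hypothetical sketch: `session_token` is minted when the sandbox session
    starts; `real_key` lives only in the proxy process, never the sandbox.
    """
    presented = auth_header.removeprefix("Bearer ")
    # hmac.compare_digest avoids leaking where the comparison diverges
    if not hmac.compare_digest(presented.encode(), session_token.encode()):
        raise PermissionError("invalid session token")
    # Upstream request carries the real credential; the agent never sees it
    return f"Bearer {real_key}"
```

Because the phantom token is only honored by this localhost proxy, exfiltrating it buys an attacker nothing off-box.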
Real credentials live in the system keystore (macOS Keychain / Linux Secret Service), memory is zeroized on drop, and DNS resolution is pinned to prevent rebinding attacks. It works transparently with OpenAI, Anthropic, and Gemini SDKs — they just follow the *_BASE_URL env vars to the proxy.
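To illustrate the redirection (values here are hypothetical — the port, path, and token string are illustrative, though `OPENAI_BASE_URL` and `ANTHROPIC_BASE_URL` are the env vars the official SDKs actually honor), a sandbox session just gets an environment like this, with no code changes inside:

```python
import os

# Illustrative session environment: phantom token as the "API key",
# base URLs pointing at the session-bound localhost proxy.
session_env = {
    "OPENAI_API_KEY": "nono-session-3f9c",           # phantom token, not a real key
    "OPENAI_BASE_URL": "http://127.0.0.1:8787/v1",   # local credential proxy
    "ANTHROPIC_API_KEY": "nono-session-3f9c",
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:8787",
}
os.environ.update(session_env)
# Any SDK client constructed after this point routes through the proxy
# and only ever transmits the phantom token on the loopback interface.
```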
Blog post walks through the architecture, the token swap flow, and how to set it up. Would love feedback from anyone thinking about agent credential security.
decodebytes•2h ago
https://nono.sh/blog/blog-credential-injection
We've also shipped other features, such as atomic rollbacks and Sigstore-based SKILL attestation.
https://github.com/always-further/nono