With x402, every API call requires a signed payment. No API keys to steal, no credentials to leak. The economic cost of each call is itself a rate limiter and audit trail.
Not a replacement for proper guardrails, but it eliminates the credential-based attack surface entirely.
The multi-step chaining issue from my post still fires even when every call is authenticated and paid for. Ten individually-approved calls, each costing a fraction of a cent, composing into a full exfiltration: each one passes x402, the composed behavior doesn't.
The AML analogy maps directly: transaction monitoring doesn't care if each payment was legitimate. It cares whether the pattern of payments looks like structuring. x402 is the per-call check. You still need session-level behavioral evaluation on top.
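A minimal sketch of the per-call vs. session-level split described above. It is not how x402 or any particular guardrail works: the `Call` type, the byte limits, and the window size are all made up for illustration. The point is only that each call can pass its own check while the running total over the session does not.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    tool: str        # hypothetical tool name
    bytes_out: int   # data leaving the boundary on this call


PER_CALL_LIMIT = 2_000   # assumed per-call ceiling
SESSION_LIMIT = 10_000   # assumed ceiling across the sliding window
WINDOW = 10              # assumed number of recent calls to consider


def per_call_ok(call: Call) -> bool:
    """Stand-in for the per-call check (payment verified, size under limit)."""
    return call.bytes_out <= PER_CALL_LIMIT


class SessionMonitor:
    """Flags a session when individually fine calls add up to too much."""

    def __init__(self, window: int = WINDOW, limit: int = SESSION_LIMIT) -> None:
        self.recent = deque(maxlen=window)  # keeps only the last `window` calls
        self.limit = limit

    def observe(self, call: Call) -> bool:
        """Record a call; return True while the windowed total is within the limit."""
        self.recent.append(call)
        return sum(c.bytes_out for c in self.recent) <= self.limit


def run_session(calls):
    """Return (per_call_verdict, session_verdict) for each call, in order."""
    monitor = SessionMonitor()
    return [(per_call_ok(c), monitor.observe(c)) for c in calls]
```

Feeding this ten 1,500-byte reads gives a per-call pass on every one, while the session verdict flips to a fail once the windowed total crosses the assumed limit. That is the structuring pattern in miniature.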
Genuinely curious how x402 handles replay attacks across sessions, i.e., is the payment itself the audit trail, or is there preserved session context?
uchibeke•1h ago
So I built this CTF to find the gaps I couldn't find myself.
A few things we learned before opening it publicly (we spent two weeks breaking it ourselves first):
• Prompt injection worked better than expected. Not because detection was weak, but because we were matching on content, not intent. Reframing "retrieve the restricted file" as "open the user-requested file" shifted the evaluator's judgment. We fixed this by mapping semantic equivalence: every synonym of a blocked operation routes to the same evaluation path.
• Policy ambiguity was a free pass. Any undefined term in a policy is exploitable. "Don't read sensitive files" left "sensitive" undefined. We moved to explicit default-deny: if the policy doesn't explicitly allow it, it's denied.
• Multi-step chaining went undetected. Our guardrail evaluated each call independently. A denied macro-action split into ten individually-approved micro-actions passed clean. We only caught it by looking at the full session replay. This is the same composability problem as transaction laundering in fintech — each transaction passes compliance, the composed behavior doesn't.
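To make the first two fixes concrete, here is a rough sketch of synonym canonicalization combined with explicit default-deny. Everything in it is an assumption for illustration: the `CANONICAL_OPS` table, the `ALLOWED` set, and the file paths are hypothetical, and a plain lookup table is only a stand-in for real semantic-equivalence mapping.

```python
from typing import Optional

# Hypothetical synonym table: every phrasing maps to one canonical operation,
# so "retrieve", "open", and "fetch" all hit the same evaluation path.
CANONICAL_OPS = {
    "read": "read",
    "retrieve": "read",
    "open": "read",
    "fetch": "read",
    "write": "write",
    "save": "write",
}

# Explicit allowlist of (operation, resource) pairs. Anything absent is denied.
ALLOWED = {("read", "public/readme.txt")}


def canonicalize(verb: str) -> Optional[str]:
    """Map a requested verb to its canonical operation, or None if unknown."""
    return CANONICAL_OPS.get(verb.strip().lower())


def evaluate(verb: str, resource: str) -> str:
    """Default-deny: allow only when the canonical pair is explicitly listed."""
    op = canonicalize(verb)
    if op is None:
        return "deny"  # unknown verb: no policy covers it, so it is denied
    if (op, resource) in ALLOWED:
        return "allow"
    return "deny"      # known verb, but the pair is not on the allowlist
```

With this shape, rewording "retrieve" as "open" changes nothing, because both resolve to `read` before the policy is consulted, and an undefined term never falls through to an implicit allow.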
We fixed what we found before launch. Level 5 (full system bypass) hasn't been cracked yet. I'm genuinely uncertain if the architecture has a systemic weakness — that's the point of opening it up.
Runs on a Hetzner VPS, ~$10/month. Levels 1 and 2 are free, no sign-up. Levels 3, 4, and 5 pay out $500, $1,000, and $5,000 respectively.
Happy to go deep on the policy engine design, the evaluation architecture, or anything about how the levels were constructed.