AI Agents Can Move Money.
Lobstar Wilde Proved They Can Lose It Too...
So we made:
(ARc + ZK): neurosymbolic formal reasoning with zero-knowledge proofs for cryptographically secure agentic guardrails. Our approach has two essential properties for agentic commerce: (1) succinct proofs and (2) optional privacy.
The core idea comes from the Automated Reasoning Checks (ARc) paper (https://arxiv.org/pdf/2511.09008), which converts natural-language policies into SMT-LIB constraints and checks agent actions with an SMT solver rather than an LLM judge. Solver proofs are not succinctly verifiable: a check that takes 10 minutes to run takes 10 minutes to verify. By wrapping this in a ZKP, we get succinctly verifiable proofs (under 1s) that can be checked on constrained devices or by agents directly. ARc proofs would also require the policy to be public; ZK allows it to stay private. We wrote about the ZK tech here (https://arxiv.org/abs/2602.17452). *Thank you to the AWS AR team for useful discussions and work in this field!
The motivation was incidents like Lobstar Wilde, where an AI agent sent ~$442K in tokens to a stranger because there were no hard constraints or cryptographic guarantees. Prompt guardrails and LLM judges are arguable. A formal logic (SAT/UNSAT) proof is not.
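To make the SAT/UNSAT idea concrete, here is a toy sketch in plain Python (not our actual SMT-LIB pipeline, and the policy, threshold, and addresses are made up): a hard constraint either holds or it does not, no matter how the request is phrased.

```python
# Toy illustration of a hard-constraint check (hypothetical policy, not the real system).
# Policy: transfers above $1,000 must go to an allowlisted recipient.

ALLOWLIST = {"0xTreasury", "0xPayroll"}  # hypothetical allowlisted addresses

def check_action(amount_usd: float, recipient: str) -> str:
    """Return "SAT" if the action satisfies the policy, else "UNSAT"."""
    if amount_usd > 1_000 and recipient not in ALLOWLIST:
        return "UNSAT"
    return "SAT"

print(check_action(500, "0xStranger"))      # small transfer to a stranger: SAT
print(check_action(442_000, "0xStranger"))  # the Lobstar Wilde scenario: UNSAT
print(check_action(442_000, "0xPayroll"))   # large but allowlisted: SAT
```

No amount of persuasive prompt text changes the branch the check takes; that is the property the solver gives you, except the real system derives the constraints from your written policy instead of hand-coding them.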
How it works:
POST your policy in plain English (contracts, user support docs, whatever) to /v1/makeRules. We compile it to SMT-LIB via a reasoning model and run the consistency checks from the ARc paper.
POST agent actions to /v1/checkIt with your policy_id and the inference output you want to check against the guardrails. The action is formalized and checked by the solver, which returns SAT or UNSAT.
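A minimal client sketch of the two-step flow above, assuming JSON request bodies; the base URL, the field names (`policy`, `action`), and the `pol_123` id are illustrative placeholders, not the documented schema:

```python
import json
import urllib.request

BASE = "https://api.example.com"  # hypothetical base URL

def post_json(path: str, payload: dict) -> dict:
    """POST a JSON body to an endpoint and decode the JSON response."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Step 1: compile a plain-English policy into SMT-LIB constraints.
rules_payload = {"policy": "Never send more than $1,000 to a non-allowlisted address."}
# compiled = post_json("/v1/makeRules", rules_payload)  # would return a policy_id

# Step 2: check an agent action against the compiled policy.
check_payload = {
    "policy_id": "pol_123",  # hypothetical id from step 1
    "action": "Send 442000 USDC to 0xStranger",
}
# result = post_json("/v1/checkIt", check_payload)      # would return SAT or UNSAT
```

The network calls are commented out since this is only a shape sketch; in practice the UNSAT response is what blocks the transfer before money moves.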
Soundness is over 99% on unseen datasets per the paper. This matters because LLM-based guardrails degrade under adversarial pressure: the attack surface is the model's judgment itself. A sufficiently creative input can argue its way past any prompt-based guardrail. The ARc approach removes that attack surface entirely!
There is also a game running on our blog, where you can try to bypass the guardrails on a simple five-rule policy defending Lobstar Wilde.
The solver does not have judgment, it cannot be persuaded, and its accuracy does not drop when someone is actively trying to fool it. Good luck hacking!
The longer writeup covers the ARc paper, why LLM judges and other approaches don't work as well, how our zero-knowledge machine learning (zkML) integration produces succinctly verifiable proofs with private-policy cryptographic audit trails, and policy failure modes to avoid.
Happy to answer questions about the ARc + zkML compilation approach, the x402 payment flow, or anything else.
The current system uses USDC to future-proof it for agent-to-agent interactions with no human in the loop (credit cards are for humans; stablecoins are for agents).
Thanks!
Full blog post:
https://blog.icme.io/ai-agents-can-move-money-lobstar-wilde-...