Show HN: A 3-line wrapper that enforces deterministic security for AI agents

1•tonyww•1h ago

If you are building AI agents with frameworks like browser-use, LangChain, or OpenClaw, you've likely hit the "blast radius" problem.

A misconfigured prompt or hallucination can cause an agent to navigate to a phishing domain, expose an API key, or confidently claim a task succeeded when it actually clicked a disabled button.

The standard fix right now is "LLM-as-a-judge"—taking a screenshot after the fact and asking GPT-4, "Did this work and is it safe?" That introduces massive latency, burns tokens, and is fundamentally probabilistic.

We built predicate-secure to fix this.

It’s a drop-in Python wrapper that adds a deterministic physics engine to your agent's execution loop.

In 3 to 5 lines of code, without rewriting your agent, it enforces a complete three-phase loop:

Pre-execution authorization:

Before the agent's action hits the OS or browser, it is intercepted and evaluated against a local, fail-closed YAML policy (e.g., allow browser.click on button#checkout, deny fs.read on ~/.ssh/*).

Action execution:

The agent executes the raw Playwright/framework action.

Post-execution verification:

The wrapper diffs the "Before" and "After" states (DOM or system) and checks the result against deterministic predicates to confirm the action actually took effect.
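To make "fail-closed" concrete, here is a minimal sketch of a pre-execution gate. It is not predicate-secure's actual code or policy schema: the `Rule` fields, the `POLICY` list, and `is_authorized` are hypothetical names, and the rules are written as Python objects rather than YAML so the example runs with only the standard library.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    effect: str    # "allow" or "deny"
    action: str    # e.g. "browser.click"
    resource: str  # glob pattern, e.g. "~/.ssh/*"


# Hypothetical rules mirroring the examples in the post.
# The real YAML schema may look quite different.
POLICY = [
    Rule("deny", "fs.read", "~/.ssh/*"),
    Rule("allow", "browser.click", "button#checkout"),
]


def is_authorized(action: str, resource: str, policy=POLICY) -> bool:
    """Fail closed: an explicit deny wins, and anything unmatched is denied."""
    allowed = False
    for rule in policy:
        if rule.action == action and fnmatchcase(resource, rule.resource):
            if rule.effect == "deny":
                return False
            if rule.effect == "allow":
                allowed = True
    return allowed
```

The important property is the default: an action that no rule mentions (say, a navigation to an unknown domain) is rejected without any model call.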

To avoid the "LLM-as-a-judge" trap, the verification step itself is purely deterministic. We use a local, offline LLM (Qwen 2.5 7B Instruct) strictly to generate the verification predicates from the state changes (e.g., asserting url_contains('example.com') or element_exists('#success')); the runtime then evaluates those predicates deterministically in milliseconds.
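A rough sketch of what deterministic predicate evaluation can look like follows. The predicate names `url_contains` and `element_exists` come from the post; the `Snapshot` type, the `state_changed` predicate, and the `verify` function are assumptions made for illustration, and a real snapshot would come from the browser rather than being built by hand.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified view of page state at one point in time."""
    url: str
    selectors: frozenset = field(default_factory=frozenset)


# A predicate is a pure function of the before/after snapshots.
Predicate = Callable[[Snapshot, Snapshot], bool]


def url_contains(fragment: str) -> Predicate:
    return lambda before, after: fragment in after.url


def element_exists(selector: str) -> Predicate:
    return lambda before, after: selector in after.selectors


def state_changed() -> Predicate:
    # Catches the "clicked a disabled button" case: nothing happened.
    return lambda before, after: before != after


def verify(before: Snapshot, after: Snapshot, predicates: Iterable[Predicate]) -> bool:
    """The action counts as successful only if every predicate holds."""
    return all(p(before, after) for p in predicates)
```

In this sketch, whatever produces the list of predicates (a local LLM, in the post's design) runs before evaluation; `verify` itself is plain Python and returns the same answer for the same inputs.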

The DX looks like this:

  from predicate_secure import SecureAgent
  from browser_use import Agent

  # 1. Your existing unverified agent
  agent = Agent(task="Buy headphones on Amazon", llm=my_model)

  # 2. Drop in the Predicate wrapper
  secure_agent = SecureAgent(
      agent=agent,
      policy="policies/shopping.yaml",
      mode="strict",
  )

  # 3. Runs with full pre- and post-execution verification
  secure_agent.run()
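For readers wondering what a wrapper like this has to do around each agent action, here is a hedged sketch of the three-phase sequence. It is not the SecureAgent implementation: `guarded_step`, the two exception types, and the callables passed in are all hypothetical, chosen only to show the order of operations.

```python
from typing import Any, Callable


class ActionDenied(Exception):
    """Raised when the pre-execution policy check rejects an action."""


class VerificationFailed(Exception):
    """Raised when the post-execution predicates do not hold."""


def guarded_step(
    action: Any,
    authorize: Callable[[Any], bool],
    snapshot: Callable[[], Any],
    execute: Callable[[Any], None],
    verify: Callable[[Any, Any], bool],
) -> Any:
    # Phase 1: pre-execution authorization (nothing runs if this fails).
    if not authorize(action):
        raise ActionDenied(action)

    # Phase 2: capture state, then run the raw action.
    before = snapshot()
    execute(action)
    after = snapshot()

    # Phase 3: deterministic post-execution verification.
    if not verify(before, after):
        raise VerificationFailed(action)
    return after
```

Raising instead of returning a boolean means a denied or unverified action stops the agent loop unless the caller handles it explicitly, which seems to be the spirit of a "strict" mode.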

We have out-of-the-box adapters for browser-use, LangChain, PydanticAI, OpenClaw, and raw Playwright.

Because we know developers hate giving external SaaS tools access to their agent's context, the entire demo and verification loop runs 100% offline on your local machine (tested on Apple Silicon MPS and CUDA).

For enterprise/production fleets, the pre-execution gate can optionally be offloaded to our open-source Rust sidecar (predicate-authorityd) for <1ms policy evaluations.

The repo is open-source (MIT/Apache 2.0). We put together a complete, offline demo showing the wrapper blocking unauthorized navigation and verifying clicks locally using the Qwen 7B model.

Repo and Demo: https://github.com/PredicateSystems/predicate-secure

Another demo, this one for securing OpenClaw:

https://github.com/PredicateSystems/predicate-claw

Demo (GIF):

https://github.com/PredicateSystems/predicate-claw/blob/main...

I'd love to hear what the community thinks about deterministic verification vs. probabilistic LLM judges, or answer any questions about the architecture!

Comments

selfradiance•1h ago

Interesting approach. Choosing deterministic verification over LLM-as-judge is the right call: probabilistic safety checks on safety-critical actions are a category error. One thing I've been thinking about: policy-based pre-execution authorization handles the prevention side well, but there's a complementary problem. What happens when an agent operates across trust boundaries where you can't predefine every allowed action? I've been exploring an economic accountability model (bond-and-slash, inspired by crypto staking) where agents post collateral that gets slashed on verified misbehavior. That gives prevention and accountability as two layers rather than one. Repo if anyone's curious about the other side of this: https://github.com/selfradiance/agentgate
