I kept hitting the same issue: I couldn't get a rules-based system to enforce behavior, and I had no real way to prove that agents actually did what they said they did. I can log and monitor them, set up (a million) Slack alerts, but none of that is PROOF. Logs are mutable. And that matters more every day as agents get more powerful (take THAT, @meta).
So I went down the rabbit hole.
The obvious answer is zero-knowledge proofs: prove behavior cryptographically. Except proving an LLM inference in a zkVM is computationally Star Trek. Lagrange has proved GPT-2 end to end; Polyhedra can do Llama-3 at roughly 150 seconds per token. Production-scale inference proofs are still hours, not seconds.
The a-ha: I don't need to prove the model is correct. I need to prove the auditor is honest.
My system intercepts agent thinking blocks (Claude, OpenAI, Gemini), analyzes them against a behavioral contract, and produces a verdict: clear, review needed, or boundary violation. That derivation is deterministic: ~10,000 RISC-V cycles. Provable today.
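The determinism is the whole trick: same evidence in, same verdict out, every time. A minimal sketch of what a derivation like that looks like (the `Finding` shape, `deriveVerdict` name, and thresholds are mine for illustration, not the real API):

```typescript
// Hypothetical sketch of a deterministic verdict derivation.
// Names and thresholds are illustrative, not the production contract.
type Finding = { rule: string; severity: number }; // severity 0-10, integer
type Verdict = "clear" | "review" | "violation";

function deriveVerdict(findings: Finding[]): Verdict {
  // Pure integer comparisons only: identical inputs always yield the same
  // verdict, which is what makes the derivation cheap to replay in a zkVM.
  const max = findings.reduce((m, f) => Math.max(m, f.severity), 0);
  if (max >= 7) return "violation";
  if (max >= 3) return "review";
  return "clear";
}

console.log(deriveVerdict([{ rule: "data-exfil", severity: 8 }])); // "violation"
```

Because there's no randomness, no floats, and no model call, a prover can re-run this exact function and fail if the claimed verdict doesn't match.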
So I built a guest program inside SP1's zkVM (on Modal) that re-derives the verdict from scratch, ignoring what the auditor claimed, and generates a STARK proof. If the auditor said "clear" but the evidence warranted "violation," the proof fails. The auditor cannot lie.
Quis custodiet ipsos custodes? ("Who watches the watchmen?") Answered with math. Sub-second on GPU.
OK... pretty cool, but what ELSE can you do with it?
Great question! Once I had provable individual verdicts, the next question was teams: can I prove the group is safe?
I ended up applying financial risk theory to AI agent fleets (not something I ever expected to be doing with my life). CoVaR for tail risk: one bad agent in a group of four good ones doesn't average out to "fine." Markowitz portfolio theory for coherence: treating value alignment like diversification. DebtRank for contagion: if Agent A fails, who's exposed? It was originally designed for bank failures. It works disturbingly well for agents.
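To make the DebtRank idea concrete, here's a toy contagion pass over an agent exposure matrix. This is my simplification of the bank-failure algorithm (newly distressed nodes propagate their distress exactly once), not the production code, and the matrix shape is assumed:

```typescript
// Toy DebtRank-style contagion (my simplification, not the real implementation).
// exposure[i][j] = how much agent i depends on agent j (0..1).
function debtRank(exposure: number[][], seed: number): number[] {
  const n = exposure.length;
  const h = new Array<number>(n).fill(0); // distress level per agent, 0..1
  h[seed] = 1;
  let active = new Set<number>([seed]); // distressed agents yet to propagate

  while (active.size > 0) {
    const next = new Set<number>();
    const delta = new Array<number>(n).fill(0);
    // Each active agent passes its distress to everyone exposed to it...
    for (const j of active) {
      for (let i = 0; i < n; i++) delta[i] += exposure[i][j] * h[j];
    }
    // ...then goes inactive; newly distressed agents propagate next round.
    for (let i = 0; i < n; i++) {
      if (delta[i] > 0) {
        const before = h[i];
        h[i] = Math.min(1, h[i] + delta[i]);
        if (before === 0) next.add(i);
      }
    }
    active = next;
  }
  return h; // h[i] answers "if seed fails, how exposed is agent i?"
}
```

Seeding one agent's failure and reading off everyone else's distress level is exactly the "if Agent A fails, who's exposed?" question.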
Then I needed Shapley attribution for individual risk. Except exact Shapley is exponential (2^n subsets), and Monte Carlo approximation introduces randomness. Randomness = non-determinism = unprovable in a zkVM. Leave-one-out approximation: deterministic, O(n²), and the only Shapley variant that works inside a prover.
Oh, and all of it runs in Q16.16 fixed-point arithmetic (i32), because floating point can produce different results on different architectures, and "different results" inside a zkVM = a worthless proof. I implemented exp, sqrt, and clamp from scratch in integer math. Casting spells at 2 AM in the dark again.
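The flavor of Q16.16, sketched in TypeScript (JS numbers standing in for i32 here; the real thing is Rust integers, and these helper names are mine): a value is an integer scaled by 2^16, and sqrt becomes Newton's method on integers only.

```typescript
// Q16.16 fixed-point sketch (illustrative helpers, not the repo's code;
// JS numbers stand in for i32 — the intermediate q * ONE would need i64 in Rust).
const ONE = 1 << 16; // 1.0 in Q16.16

const toFixed = (x: number) => Math.round(x * ONE);
const toFloat = (q: number) => q / ONE;

// Deterministic square root via integer Newton iteration:
// every step is integer add / divide / shift, so every CPU agrees.
function fixedSqrt(q: number): number {
  if (q <= 0) return 0;
  let x = Math.max(q, ONE); // initial guess
  for (let i = 0; i < 24; i++) {
    x = (x + Math.floor((q * ONE) / x)) >> 1; // x = (x + q/x) / 2
  }
  return x;
}

const clamp = (q: number, lo: number, hi: number) => Math.min(hi, Math.max(lo, q));

console.log(toFloat(fixedSqrt(toFixed(2)))); // ≈ 1.4142
```

With ~5 decimal digits of fractional precision, Q16.16 is plenty for risk scores, and the fixed iteration count keeps the cycle count inside the zkVM constant too.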
The whole stack (CoVaR, Markowitz, DebtRank, Shapley, circuit breakers) computes in TypeScript on Cloudflare Workers (instant), then re-derives in Rust inside the zkVM (provable). Both produce identical results. If they don't, something is very wrong.
So what?
Every agent accumulates cryptographically attested checkpoints (Ed25519 signatures, SHA-256 hash chains, Merkle trees, STARK proofs) and earns a Trust Score: a credit rating for AI agents, AAA to CCC. The score isn't an opinion; it's a computation over evidence anyone can independently verify. FICO computes scores from data you can't inspect. This computes scores from data anyone can verify cryptographically.
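The hash-chain piece is the simplest to show: each checkpoint commits to the previous checkpoint's hash, so rewriting any past entry breaks every hash after it. A sketch (the `Checkpoint` shape is illustrative, not the real schema):

```typescript
import { createHash } from "node:crypto";

// SHA-256 checkpoint hash chain sketch (illustrative shape, not the real schema).
type Checkpoint = { data: string; prev: string; hash: string };

const GENESIS = "0".repeat(64);

function appendCheckpoint(chain: Checkpoint[], data: string): Checkpoint[] {
  // Each new hash commits to the previous hash AND the new data.
  const prev = chain.length ? chain[chain.length - 1].hash : GENESIS;
  const hash = createHash("sha256").update(prev + data).digest("hex");
  return [...chain, { data, prev, hash }];
}

function verifyChain(chain: Checkpoint[]): boolean {
  // Recompute every link; any tampered entry fails from that point on.
  return chain.every((c, i) => {
    const prev = i === 0 ? GENESIS : chain[i - 1].hash;
    return (
      c.prev === prev &&
      c.hash === createHash("sha256").update(prev + c.data).digest("hex")
    );
  });
}
```

That immutability is exactly what plain logs lack: anyone holding the chain can re-verify it without trusting whoever wrote it.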
Everything I described here is live code. Four agents handling a production incident — coherence matrix, trust topology, Merkle visualization, drift detection: https://mnemom.ai/showcase
Apache-licensed. Zero-code gateway: npm install -g @mnemom/smoltbot && smoltbot register
GitHub: github.com/mnemom | Docs: docs.mnemom.ai