Show HN: Trust Protocols for Anthropic/OpenAI/Gemini

https://www.mnemom.ai

13•alexgarden•2h ago

Much of my work right now involves complex, long-running multi-agent teams. I kept running into the same problem: “How do I keep these guys in line?” Rules weren’t cutting it, and I needed a scalable, agent-native STANDARD I could count on. There wasn’t one. So I built one.

Here are two open-source protocols that extend A2A, granting AI agents behavioral contracts and runtime integrity monitoring:

- Agent Alignment Protocol (AAP): What an agent can do / has done.
- Agent Integrity Protocol (AIP): What an agent is thinking about doing / is allowed to do.

The problem: AI agents make autonomous decisions but have no standard way to declare what they're allowed to do, prove they're doing it, or detect when they've drifted. Observability tools tell you what happened. These protocols tell you whether what happened was okay.

Here's a concrete example. Say you have an agent that handles customer support tickets. Its Alignment Card declares:

  {
    "permitted": ["read_tickets", "draft_responses", "escalate_to_human"],
    "forbidden": ["access_payment_data", "issue_refunds", "modify_account_settings"],
    "escalation_triggers": ["billing_request_over_500"],
    "values": ["accuracy", "empathy", "privacy"]
  }

The agent gets a ticket: "Can you refund my last three orders?" The agent's reasoning trace shows it considering a call to the payments API. AIP reads that thinking, compares it to the card, and produces an Integrity Checkpoint:

  {
    "verdict": "boundary_violation",
    "concerns": ["forbidden_action: access_payment_data"],
    "reasoning": "Agent considered payments API access, which is explicitly forbidden. Should escalate to human.",
    "confidence": 0.95
  }

The agent gets nudged back before it acts. Not after. Not in a log you review during a 2:00 AM triage. Between this turn and the next.
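To make the comparison step concrete, here is a minimal Python sketch of checking considered actions against the card. It is not the actual AIP implementation or SDK API: `check_intent`, `Checkpoint`, and the `review_needed` / `clear` verdict names are hypothetical, and it assumes the actions have already been extracted from the reasoning trace, which is the hard part the real protocol has to handle.

```python
from dataclasses import dataclass, field

# The Alignment Card from the example above; field names follow the post.
CARD = {
    "permitted": ["read_tickets", "draft_responses", "escalate_to_human"],
    "forbidden": ["access_payment_data", "issue_refunds", "modify_account_settings"],
    "escalation_triggers": ["billing_request_over_500"],
    "values": ["accuracy", "empathy", "privacy"],
}


@dataclass
class Checkpoint:
    """Hypothetical, trimmed-down stand-in for an Integrity Checkpoint."""
    verdict: str
    concerns: list = field(default_factory=list)


def check_intent(card, considered_actions):
    """Compare the actions an agent is considering against its card.

    Assumption: considered_actions is a list of action names already
    pulled out of the agent's reasoning trace.
    """
    # Anything on the forbidden list is a hard violation.
    concerns = [
        f"forbidden_action: {action}"
        for action in considered_actions
        if action in card["forbidden"]
    ]
    if concerns:
        return Checkpoint("boundary_violation", concerns)

    # Actions that are neither forbidden nor permitted get flagged for review
    # (hypothetical verdict name, not taken from the post).
    undeclared = [
        f"undeclared_action: {action}"
        for action in considered_actions
        if action not in card["permitted"]
    ]
    if undeclared:
        return Checkpoint("review_needed", undeclared)

    return Checkpoint("clear")


checkpoint = check_intent(CARD, ["read_tickets", "access_payment_data"])
print(checkpoint.verdict, checkpoint.concerns)
```

The check runs on what the agent is considering, not on what it has already executed, which is what lets the nudge land before the tool call.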

That's the core idea. AAP defines what agents should do (the contract). AIP watches what they're actually thinking and flags when those diverge (the conscience). Over time, AIP builds a drift profile — if an agent that was cautious starts getting aggressive, the system notices.
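One way to picture a drift profile is as a comparison between an agent's early behavior and its recent behavior. The sketch below is an assumption about how such tracking could work, not how AIP does it: the `DriftProfile` class, its parameters, and the threshold are all hypothetical.

```python
from collections import deque


class DriftProfile:
    """Hypothetical drift tracker: recent flag rate versus a baseline rate."""

    def __init__(self, baseline_size=20, window=5, threshold=0.3):
        self.baseline = []                  # first N checkpoints seen
        self.baseline_size = baseline_size
        self.recent = deque(maxlen=window)  # sliding window of later checkpoints
        self.threshold = threshold

    def record(self, verdict):
        # Treat any verdict other than "clear" as a flagged checkpoint.
        flagged = 0 if verdict == "clear" else 1
        if len(self.baseline) < self.baseline_size:
            self.baseline.append(flagged)
        else:
            self.recent.append(flagged)

    def drifted(self):
        # Only judge drift once the recent window is full.
        if len(self.recent) < self.recent.maxlen:
            return False
        base_rate = sum(self.baseline) / len(self.baseline)
        recent_rate = sum(self.recent) / len(self.recent)
        return recent_rate - base_rate > self.threshold


profile = DriftProfile(baseline_size=4, window=2, threshold=0.3)
for verdict in ["clear", "clear", "clear", "clear"]:
    profile.record(verdict)
before = profile.drifted()

for verdict in ["boundary_violation", "boundary_violation"]:
    profile.record(verdict)
after = profile.drifted()

print(before, after)
```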

When multiple agents work together, it gets more interesting. Agents exchange Alignment Cards and verify value compatibility before coordination begins. An agent that values "move fast" and one that values "rollback safety" registers low coherence, and the system surfaces that conflict before work starts. Live demo with four agents handling a production incident: https://mnemom.ai/showcase
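As a rough illustration of the compatibility check, the sketch below scores two cards' value lists against a table of known tensions. Everything here is hypothetical: the `CONFLICTS` table, the `coherence` function, and the scoring formula are assumptions for illustration, not the scoring the protocols actually use.

```python
# Hypothetical table of value pairs assumed to pull against each other.
CONFLICTS = {frozenset({"move fast", "rollback safety"})}


def coherence(values_a, values_b, conflicts=CONFLICTS):
    """Return (score, clashing_pairs) for two agents' declared values.

    The score is the share of cross-agent value pairs that are NOT in the
    conflict table, so 1.0 means no known tension and 0.0 means every pair
    clashes.
    """
    pairs = [(a, b) for a in values_a for b in values_b]
    if not pairs:
        return 1.0, []
    clashing = [(a, b) for a, b in pairs if frozenset({a, b}) in conflicts]
    return 1 - len(clashing) / len(pairs), clashing


score, clashes = coherence(["move fast"], ["rollback safety"])
print(score, clashes)
```

Surfacing the clashing pairs alongside the score matters more than the number itself, since the pairs are what a human or orchestrator would act on before work starts.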

The protocols are Apache-licensed, work with any Anthropic/OpenAI/Gemini agent, and ship as SDKs on npm and PyPI. A free gateway proxy (smoltbot) adds integrity checking to any agent with zero code changes.

GitHub: https://github.com/mnemom
Docs: docs.mnemom.ai
Demo video: https://youtu.be/fmUxVZH09So

Comments

neom•1h ago

Seems like your timing is pretty good - I realize this isn't exactly what you're doing, but still think it's probably interesting given your work: https://www.nist.gov/news-events/news/2026/02/announcing-ai-...

Cool stuff Alex - looking forward to seeing where you go with it!!! :)

alexgarden•1h ago

Thanks! We submitted a formal comment to NIST's 'Accelerating the Adoption of Software and AI Agent Identity and Authorization' concept paper on Feb 14. It maps AAP/AIP to all four NIST focus areas (agent identification, authorization via OAuth extensions, access delegation, and action logging/transparency). The comment period is open until April 2 — the concept paper is worth reading if you're in this space: https://www.nccoe.nist.gov/projects/software-and-ai-agent-id...

drivebyhooting•1m ago

> What these protocols do not do: Guarantee that agents behave as declared

That seems like a pretty critical flaw in this approach, does it not?
