Show HN: I built a firewall for agents because prompt engineering isn't security

7•yaront111•2w ago

Hi HN, I’m the creator of Cordum.

I’ve been working in DevOps and infrastructure for years (currently in the fintech/security space), and as I started playing with AI agents, I noticed a scary pattern. Most "safety" mechanisms rely on system prompts ("Please don't do X") or flimsy Python logic inside the agent itself.

If we treat agents as autonomous employees, giving them root access and hoping they listen to instructions feels insane to me. I wanted a way to enforce hard constraints that the LLM cannot override, no matter how "jailbroken" it gets.

So I built Cordum. It’s an open-source "Safety Kernel" that sits between the LLM's intent and the actual execution.

The architecture is designed to be language-agnostic:

1. *Control Plane (Go/NATS/Redis):* Manages the state and policy.
2. *The Protocol (CAP v2):* A wire format that defines jobs, steps, and results.
3. *Workers:* You can write your agent in Python (using Pydantic), Node, or Go, and they all connect to the same safety mesh.

Key features I focused on:

- *The "Kill Switch":* Ability to revoke an agent's permissions instantly via the message bus, without killing the host server.
- *Audit Logs:* Every intent and action is recorded (critical for when things go wrong).
- *Policy Enforcement:* Blocking actions based on metadata (e.g., "Review required for any transfer > $50") before they reach the worker.
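To make the three features above concrete, here is a minimal in-process sketch in Python. It is not Cordum's API or the CAP v2 schema: `Intent`, `SafetyKernel`, the `review_threshold` parameter, and the reason strings are all hypothetical names chosen for illustration, and a real deployment would run this check in the external control plane rather than inside the worker.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical shape of an agent's requested action (not the CAP v2 schema)."""
    agent_id: str
    action: str
    metadata: dict = field(default_factory=dict)


class SafetyKernel:
    """Toy stand-in for an external control plane (assumed design, for illustration)."""

    def __init__(self, review_threshold=50.0):
        self.review_threshold = review_threshold
        self.revoked = set()   # agent ids whose permissions were revoked
        self.audit_log = []    # every intent plus the decision made on it

    def revoke(self, agent_id):
        # "Kill switch": later intents from this agent are denied; the host keeps running.
        self.revoked.add(agent_id)

    def check(self, intent):
        if intent.agent_id in self.revoked:
            decision = {"status": "blocked", "reason": "agent_revoked"}
        elif (intent.action == "transfer"
              and intent.metadata.get("amount", 0) > self.review_threshold):
            # Metadata-based policy: transfers above the threshold need review.
            decision = {"status": "blocked", "reason": "review_required"}
        else:
            decision = {"status": "allowed", "reason": None}
        # Record the intent and the decision before anything reaches a worker.
        self.audit_log.append({"intent": intent, "decision": decision})
        return decision


kernel = SafetyKernel()
small = kernel.check(Intent("agent-1", "transfer", {"amount": 20}))
large = kernel.check(Intent("agent-1", "transfer", {"amount": 75}))
kernel.revoke("agent-1")
after_revoke = kernel.check(Intent("agent-1", "transfer", {"amount": 20}))
print(small["status"], large["status"], after_revoke["reason"])
```

The point of the sketch is that the decision is made outside the agent's own logic, so nothing the LLM generates can change the threshold or undo a revocation.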

It’s still early days (v0.x), but I’d love to hear your thoughts on the architecture. Is a separate control plane overkill, or is this where agentic infrastructure is heading?

Repo: https://github.com/cordum-io/cordum

Thanks!

Comments

hackerunewz•2w ago

Nice job, but isn't it a bit overkill?

yaront111•2w ago

It is overkill for a demo. But for my production environment, I need an external safety layer. I can't rely on 'prompt engineering' when real data is at stake.

amadeuswoo•2w ago

Interesting architecture. I'm curious about the workflow when an agent hits a denied action: does it get a structured rejection it can reason about and try an alternative, or does it just fail? I'm wondering how the feedback loop works between the safety kernel and the LLM's planning.

yaront111•2w ago

Great question. This is actually a core design principle of the Cordum Agent Protocol (CAP).

It’s definitely a *structured rejection*, not a silent fail. Since the LLM needs to "know" it was blocked to adjust its plan, the kernel returns a standard error payload (e.g., `PolicyViolationError`) with context.

The flow looks like this:

1. *Agent:* Sends intent "Delete production DB".
2. *Kernel:* Checks policy -> DENY.
3. *Kernel:* Returns a structured result: `{ "status": "blocked", "reason": "destructive_action_limit", "message": "Deletion requires human approval" }`.
4. *Agent (LLM):* Receives this as an observation.
5. *Agent (Re-planning):* "Oh, I can't delete it. I will generate a slack message to the admin asking for approval instead."
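The five steps above can be sketched as a small loop. This is an illustration under stated assumptions, not Cordum's implementation: `kernel_check`, `replan`, and `run` are hypothetical helpers, the deny rule is hard-coded, and `replan` stands in for what would really be an LLM call that receives the rejection as an observation. Only the rejection payload's fields are taken from the example in step 3.

```python
def kernel_check(intent):
    """Stand-in for the kernel's policy decision (hypothetical hard-coded rule)."""
    if intent.lower().startswith("delete"):
        return {
            "status": "blocked",
            "reason": "destructive_action_limit",
            "message": "Deletion requires human approval",
        }
    return {"status": "ok"}


def replan(intent, rejection):
    """Stand-in for the LLM re-planning step: choose an alternative from the reason."""
    if rejection["reason"] == "destructive_action_limit":
        return f"notify_admin: approval needed for '{intent}'"
    return None  # no known alternative, so give up


def run(intent, max_attempts=3):
    """Submit an intent; feed each rejection back as an observation and retry."""
    trace = []
    for _ in range(max_attempts):
        result = kernel_check(intent)
        trace.append((intent, result["status"]))
        if result["status"] != "blocked":
            break
        intent = replan(intent, result)
        if intent is None:
            break
    return trace


trace = run("Delete production DB")
for step in trace:
    print(step)
```

Bounding the loop with `max_attempts` is a design choice made here so that an agent that keeps proposing blocked actions cannot retry forever.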

This feedback loop turns safety from a "blocker" into a constraint that the agent can reason around, which is critical for autonomous recovery.

exordex•2w ago

I built formal testing for AI agents. It runs on the CLI, and a free version is launching soon. It includes MCP security tests and chaos engineering features: https://exordex.com/waitlist

yaront111•2w ago

Exordex is a great tool for the CI/CD pipeline to test agents. Cordum is the Runtime Kernel that enforces those policies in production. Ideally? You use Exordex to test that your agent works, and Cordum to guarantee it stays safe.

TeamCommet1•2w ago

Regarding the separate control plane: I don't think it's overkill if you're aiming for multi-agent orchestration. A safety mesh needs to be centralized to maintain a global state of permissions. If you bake the safety logic into each worker, you end up with the same "flimsy logic" problem you're trying to solve.

Curious, how are you handling latency in the CAP v2 protocol when the control plane has to intercept every intent before execution?
