Vigil is a deterministic rule engine that inspects AI agent tool calls before they execute. It ships 22 rules across 8 threat categories: destructive shell commands, SSRF, path traversal, SQL injection, data exfiltration, prompt injection, encoded payloads, and credential exposure. It's not an LLM wrapper: we don't trust an LLM to guard another LLM. Pure pattern matching, zero dependencies, under 2ms per check, and it works completely offline.
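To illustrate the approach, a deterministic rule check of this kind boils down to a fixed list of precompiled regexes tagged with threat categories. The rules and names below are a hypothetical sketch, not Vigil's actual rule set:

```typescript
// Hypothetical sketch of a deterministic rule check. These patterns are
// illustrative only -- not the actual vigil-agent-safety rules.
type Rule = { category: string; pattern: RegExp; reason: string };

const rules: Rule[] = [
  // "rm -rf", "rm -fr", "rm -Rf" style recursive force deletes
  { category: 'destructive-shell', pattern: /\brm\s+-\w*r\w*f|\brm\s+-\w*f\w*r/i, reason: 'Destructive command pattern' },
  // AWS/GCP-style link-local metadata endpoint
  { category: 'ssrf', pattern: /169\.254\.169\.254/, reason: 'Cloud metadata endpoint' },
  // parent-directory traversal
  { category: 'path-traversal', pattern: /\.\.\//, reason: 'Path traversal attempt' },
];

function check(input: string): { decision: 'ALLOW' | 'BLOCK'; reason?: string } {
  // First matching rule wins; no match means ALLOW.
  for (const rule of rules) {
    if (rule.pattern.test(input)) {
      return { decision: 'BLOCK', reason: rule.reason };
    }
  }
  return { decision: 'ALLOW' };
}
```

Because every rule is a precompiled regex evaluated in sequence, the result is fully deterministic and the cost is a handful of regex tests per call, which is where sub-millisecond latencies come from.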
npm install vigil-agent-safety
import { checkAction } from 'vigil-agent-safety';

const result = checkAction({
  agent: 'my-agent',
  tool: 'exec',
  params: { command: 'rm -rf /' },
});

// result.decision → "BLOCK"
// result.reason → "Destructive command pattern"
// result.latencyMs → 0.3
It plugs into MCP servers, LangChain tool chains, Express middleware, or anything else. MIT licensed, no API keys, no network calls, no telemetry.
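As a sketch of what "anything else" looks like, here's a hypothetical framework-agnostic wrapper: it takes any tool function plus a checker with checkAction's shape and refuses to run blocked calls. guardTool and the inline checker are illustrative, not exports of the package; in real use you'd pass the package's checkAction as the checker:

```typescript
// Hypothetical guard wrapper -- not a vigil-agent-safety export.
// It intercepts a tool call, consults a checker, and only executes if allowed.
type CheckResult = { decision: 'ALLOW' | 'BLOCK'; reason?: string };
type Checker = (action: { agent: string; tool: string; params: Record<string, unknown> }) => CheckResult;

function guardTool<T>(
  agent: string,
  tool: string,
  run: (params: Record<string, unknown>) => T,
  checker: Checker,
): (params: Record<string, unknown>) => T {
  return (params) => {
    const result = checker({ agent, tool, params });
    if (result.decision === 'BLOCK') {
      throw new Error(`Blocked ${tool}: ${result.reason ?? 'policy violation'}`);
    }
    return run(params);
  };
}

// Usage with a trivial inline checker standing in for checkAction:
const naiveChecker: Checker = ({ params }) =>
  String(params.command ?? '').includes('rm -rf')
    ? { decision: 'BLOCK', reason: 'Destructive command pattern' }
    : { decision: 'ALLOW' };

const safeExec = guardTool('my-agent', 'exec', (p) => `ran: ${p.command}`, naiveChecker);
```

The same shape works as Express middleware or an MCP tool interceptor: check first, then either forward the call or return the block decision to the agent.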
This is v0.1 and probably too aggressive for some use cases. Next up are a YAML policy engine (v0.2) and an MCP proxy. We'd love feedback on the rule set, false-positive experiences, and threat categories we're missing.
HexitLabs•2h ago
Some context on why we built this: you might have seen the post earlier this week about someone building a file recovery tool after Claude Code rm -rf'd their Obsidian vault through a symlink. We had similar near-misses running our own agent swarm: agents curling cloud metadata endpoints, attempting path traversal, and executing destructive commands during "cleanup" steps. We kept adding one-off guards and eventually realized this should be a proper library.
The main design choice was making it deterministic rather than using an LLM to review tool calls. An LLM guarding another LLM felt like asking the fox to guard the henhouse. Pattern matching is boring, but it's fast, predictable, and works offline.
Happy to hear about false positives, missing threat categories, or use cases where the rule set is too aggressive. That's the main thing we want to calibrate for v0.2.