
I built a governance layer for multi-agent AI coding – lessons after 6 months

2•vincentvandeth•2h ago

Six months ago I started coordinating multiple AI coding agents (Claude Code, Codex CLI, Gemini CLI) across parallel terminals for a production project. The agents were productive, but I had no idea what they were actually deciding or why.

The problem wasn't capability — it was accountability. An agent would make a choice buried in a 50-file commit, and I'd only find out weeks later when something broke. No trace of which agent did what, when, or based on what context.

So I built a governance layer on top. The core idea: every agent decision gets recorded in an append-only receipt ledger (NDJSON). Each receipt links a specific agent action to a git commit, a dispatch ID, and a quality verdict. The orchestrator (T0) reviews receipts and decides what happens next — approve, hold, or redispatch.
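A minimal sketch of what an append-only NDJSON receipt ledger could look like. The post only says a receipt links an agent action to a git commit, a dispatch ID, and a quality verdict, so the field names, the `pass`/`fail` verdict values, and the helper names (`new_receipt`, `append_receipt`, `read_receipts`) are all assumptions for illustration, not the project's actual schema.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class Receipt:
    # Hypothetical fields; the real ledger schema is not given in the post.
    dispatch_id: str
    agent: str
    action: str
    commit: str
    verdict: str  # assumed values: "pass" or "fail"
    recorded_at: str


def new_receipt(dispatch_id: str, agent: str, action: str,
                commit: str, verdict: str) -> Receipt:
    """Build a receipt stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return Receipt(dispatch_id, agent, action, commit, verdict, now)


def append_receipt(ledger: Path, receipt: Receipt) -> None:
    """Append one receipt as a single JSON line; earlier lines are never rewritten."""
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(receipt), sort_keys=True) + "\n")


def read_receipts(ledger: Path) -> list[dict]:
    """Load every receipt, one JSON object per non-empty line."""
    with ledger.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    ledger = Path(tmp) / "receipts.ndjson"
    append_receipt(ledger, new_receipt("D-001", "claude-code", "refactor parser", "abc1234", "pass"))
    append_receipt(ledger, new_receipt("D-002", "codex-cli", "add tests", "def5678", "fail"))
    receipts = read_receipts(ledger)
    failed = [r["dispatch_id"] for r in receipts if r["verdict"] == "fail"]
```

Because each receipt is one self-contained line, the file can be tailed, grepped, or replayed without a database, which fits the filesystem-only design.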

Some things I learned:

1. Sub-agents are a black box. I never use them. When a bug surfaces, you can't trace which agent's context was polluted. Instead, I run independent agents in separate terminals with their own context windows, reporting back to T0.

2. Quality gates need to be deterministic, not LLM-based. An automated advisory checks every completion against pre-registered rules (file size limits, test coverage, open blockers). The LLM proposes, the gate validates. No vibes.

3. Context rotation is unsolved by the ecosystem. When an agent fills its context window mid-task, most workflows just fail. I built an automated rotation pipeline using Claude Code hooks: it detects context usage, writes a structured handover, clears the window, and resumes. Zero human intervention.

4. The receipt ledger is the most valuable artifact. After 1100+ entries, patterns emerge: which types of tasks fail, which agents struggle with what, where context pollution happens. That data feeds back into dispatch planning.

5. Terminal locking prevents chaos. Each terminal can only work on one dispatch at a time. Sounds obvious, but without it you get overlapping work, merge conflicts, and agents overwriting each other's changes.
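To make the deterministic gate idea concrete, here is a rough sketch of a rule check with no LLM in the loop. The three rule types (file size, test coverage, open blockers) come from the post; the `Completion` and `GateRules` shapes, the `run_gate` name, and the threshold values are assumptions picked for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Completion:
    # Hypothetical summary of what a worker reports back to T0.
    largest_file_lines: int
    test_coverage: float  # fraction between 0 and 1
    open_blockers: tuple[str, ...] = ()


@dataclass(frozen=True)
class GateRules:
    # Illustrative thresholds, not values from the post.
    max_file_lines: int = 500
    min_test_coverage: float = 0.80


def run_gate(completion: Completion, rules: GateRules = GateRules()) -> list[str]:
    """Return every rule violation; an empty list means the gate passes."""
    violations = []
    if completion.largest_file_lines > rules.max_file_lines:
        violations.append(
            f"file too large: {completion.largest_file_lines} > {rules.max_file_lines} lines"
        )
    if completion.test_coverage < rules.min_test_coverage:
        violations.append(
            f"coverage too low: {completion.test_coverage:.2f} < {rules.min_test_coverage:.2f}"
        )
    for blocker in completion.open_blockers:
        violations.append(f"open blocker: {blocker}")
    return violations


clean = run_gate(Completion(largest_file_lines=120, test_coverage=0.91))
dirty = run_gate(Completion(largest_file_lines=900, test_coverage=0.50, open_blockers=("B-7",)))
```

The same input always yields the same list of violations, so a verdict can be re-checked later from the recorded inputs alone.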

The system runs across 4 tmux panes (T0 orchestrator + 3 worker tracks), supports multiple AI providers, and everything is filesystem-based — no database, no cloud dependency. Open-sourced it recently.
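One way a filesystem-only terminal lock could work, sketched under assumptions: one lock file per terminal, created atomically with `O_CREAT | O_EXCL` so a second dispatch to a busy terminal is refused. The lock directory layout and the `acquire_terminal` / `release_terminal` names are hypothetical, and the actual project may do this differently.

```python
import os
import tempfile
from pathlib import Path


def acquire_terminal(lock_dir: Path, terminal: str, dispatch_id: str) -> bool:
    """Try to claim a terminal for one dispatch; return False if it is already busy."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{terminal}.lock"
    try:
        # O_EXCL makes creation fail if the file exists, so only one claimant wins.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(dispatch_id)  # record which dispatch holds the terminal
    return True


def release_terminal(lock_dir: Path, terminal: str) -> None:
    """Free the terminal; a missing lock file is not an error."""
    (lock_dir / f"{terminal}.lock").unlink(missing_ok=True)


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    locks = Path(tmp) / "locks"
    first = acquire_terminal(locks, "T1", "D-001")   # True: terminal was free
    second = acquire_terminal(locks, "T1", "D-002")  # False: T1 is still busy
    release_terminal(locks, "T1")
    third = acquire_terminal(locks, "T1", "D-002")   # True: free again
```

This sketch does not handle stale locks left behind by a crashed agent; a real setup would likely need a timeout or an owner check on top.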

Happy to answer questions about the architecture or specific failure modes.
