Show HN: 127 PRs to Prod this wknd with 18 AI agents: metaswarm. MIT licensed

https://github.com/dsifry/metaswarm

4•dsifry•3h ago

A few weeks ago I posted about GoodToGo https://news.ycombinator.com/item?id=46656759 - a tool that gives AI agents a deterministic answer to "is this PR ready to merge?" Several people asked about the larger orchestration system I mentioned. This is that system.

I got tired of being a project manager for Claude Code. It writes code fine, but shipping production code is seven or eight jobs — research, planning, design review, implementation, code review, security audit, PR creation, CI babysitting. I was doing all the coordination myself. The agent typed fast. I was still the bottleneck. What I really needed was an orchestrator of orchestrators - swarms of swarms of agents with deterministic quality checks.

So I built metaswarm. It breaks work into phases and assigns each to a specialist swarm orchestrator. It manages handoffs and uses BEADS for deterministic gates that persist across /compact, /clear, and even across sessions. Point it at a GitHub issue or brainstorm with it (it uses Superpowers to ask clarifying questions) and it creates epics, tasks, and dependencies, then runs the full pipeline to a merged PR - including outside code reviewers like CodeRabbit, Greptile, and Bugbot.

The thing that surprised me most was the design review gate. Five agents — PM, Architect, Designer, Security, CTO — review every plan in parallel before a line of code gets written. All five must approve. Three rounds max, then it escalates to a human. I expected a rubber stamp. It catches real design problems, dependency issues, security gaps.
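
For readers who want the shape of that gate in code, here is a minimal sketch. It is not taken from the metaswarm repo: the names `Reviewer`, `Verdict`, `designReviewGate`, and the `revise` callback are all hypothetical, and only the rules (parallel review, unanimous approval, three rounds, then escalate to a human) come from the description above.

```typescript
// Hypothetical types for illustration; none of these names come from metaswarm.
type Verdict = { reviewer: string; approved: boolean; notes: string };
type Reviewer = { name: string; review: (plan: string) => Promise<Verdict> };
type GateResult =
  | { status: "approved"; rounds: number }
  | { status: "escalated"; rounds: number; blockers: Verdict[] };

// From the post: three rounds max, then it escalates to a human.
const MAX_ROUNDS = 3;

async function designReviewGate(
  plan: string,
  reviewers: Reviewer[],
  // Assumed hook: produces a revised plan from the blocking feedback.
  revise: (plan: string, blockers: Verdict[]) => Promise<string>,
): Promise<GateResult> {
  let current = plan;
  let blockers: Verdict[] = [];
  for (let round = 1; round <= MAX_ROUNDS; round++) {
    // Every reviewer looks at the same version of the plan, in parallel.
    const verdicts = await Promise.all(reviewers.map((r) => r.review(current)));
    blockers = verdicts.filter((v) => !v.approved);
    // Unanimous approval is required; a single rejection blocks the plan.
    if (blockers.length === 0) return { status: "approved", rounds: round };
    // Only revise if there is another round left to review the revision.
    if (round < MAX_ROUNDS) current = await revise(current, blockers);
  }
  // Out of rounds: hand the remaining objections to a human.
  return { status: "escalated", rounds: MAX_ROUNDS, blockers };
}
```

The point of returning `blockers` on escalation is that the human sees exactly which reviewers still object and why, rather than a bare failure.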

This weekend I pointed it at my backlog. 127 PRs merged. Every one hit 100% test coverage. No human wrote code, reviewed code, or clicked merge. OK, I guided it a bit, mostly helping with plans for some of the epics.

A few learnings:

Agent checklists are theater. Agents skipped coverage checks, misread thresholds, or decided the checks didn't apply. Prompts alone weren't enough. The fix was deterministic gates: BEADS, pre-push hooks, and CI jobs, all layered on top of the agent's own completion check. The gates block bad code whether or not the agent cooperates.
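
To make "deterministic gate" concrete, here is a rough sketch of a coverage check that a pre-push hook or CI job could run. It is an illustration, not metaswarm's implementation: the function names are made up, and the input shape is assumed to match the `total` object of an Istanbul-style `coverage-summary.json`.

```typescript
// Assumed input shape: the "total" section of an Istanbul-style coverage summary.
type Metric = { pct: number };
type CoverageTotals = {
  lines: Metric;
  statements: Metric;
  functions: Metric;
  branches: Metric;
};

// Hypothetical helper: lists every metric that falls below the threshold.
function coverageFailures(totals: CoverageTotals, threshold: number): string[] {
  const keys = ["lines", "statements", "functions", "branches"] as const;
  return keys
    .filter((k) => totals[k].pct < threshold)
    .map((k) => `${k}: ${totals[k].pct}% < ${threshold}%`);
}

// Hypothetical helper: maps the result to a process exit code.
// A hook script would pass this to process.exit so a non-zero code blocks the push.
function gateExitCode(totals: CoverageTotals, threshold: number): 0 | 1 {
  return coverageFailures(totals, threshold).length === 0 ? 0 : 1;
}
```

Because the decision is plain arithmetic on a report file, the agent has no say in it: either every metric meets the threshold or the push is rejected.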

The agents are just markdown files. No custom runtime, no server, and while I built it on TypeScript, the agents are language-agnostic. You can read all of them, edit them, add your own.

It self-reflects too. After every merged PR, the system extracts patterns, gotchas, and decisions into a JSONL knowledge base. Agents only load entries relevant to the files they're touching. The more it ships, the fewer mistakes it makes. It learns as it goes.
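
A minimal sketch of the "only load relevant entries" idea, under stated assumptions: the entry fields (`kind`, `paths`, `note`) and the prefix-matching rule are invented for illustration, and the real metaswarm schema and selection logic may differ.

```typescript
// Hypothetical entry schema; the actual knowledge base format may differ.
type KnowledgeEntry = {
  kind: "pattern" | "gotcha" | "decision";
  paths: string[]; // path prefixes this entry applies to (assumed field)
  note: string;
};

// JSONL is one JSON object per line; blank lines are skipped.
function parseJsonl(text: string): KnowledgeEntry[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as KnowledgeEntry);
}

// Keep an entry if any of its path prefixes matches any file being touched.
function relevantEntries(
  entries: KnowledgeEntry[],
  touchedFiles: string[],
): KnowledgeEntry[] {
  return entries.filter((entry) =>
    entry.paths.some((prefix) => touchedFiles.some((file) => file.startsWith(prefix))),
  );
}
```

Filtering before loading keeps the agent's context small: an agent editing files under one directory never sees notes recorded for unrelated parts of the codebase.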

metaswarm stands on two projects: https://github.com/steveyegge/beads by Steve Yegge (git-native task tracking and knowledge priming) and https://github.com/obra/superpowers by Jesse Vincent (disciplined agentic workflows — TDD, brainstorming, systematic debugging). Both were essential.

Background: I founded Technorati, Linuxcare, and Warmstart; tech exec at Lyft and Reddit. I built metaswarm because I needed autonomous agents that could ship to a production codebase with the same standards I'd hold a human team to.

$ cd my-project-name

$ npx metaswarm init

MIT licensed. IANAL. YMMV. Issues/PRs welcome!

Comments

yodon•4m ago

This looks amazing! Curious if you (or others) have dug into the upcoming Claude swarms feature? It looks like that would let you remove the dependency on beads, as Claude seems to be getting native task tracking and inter-agent messaging capabilities.
