Hi HN,
I've been experimenting a lot with autonomous coding agents (like OpenClaw and various CLI tools), but I kept running into two major roadblocks when trying to use them for actual production work:
Security & State: Most agents need persistent state on a local machine or VPS, often with broad file system access. It's a security nightmare for enterprise code.
Chaos: Generalist agents try to do everything at once (planning, coding, testing) in a single context window, which inevitably leads to hallucinations and broken dependencies.
I wanted a system that mirrored how actual engineering teams work, so I built an open-source framework that runs entirely inside GitHub.
How it works: I mapped out a multi-agent system (Scrum Master, Planner, Dev, QA Tester) using Anthropic's Claude, orchestrated entirely through GitHub Actions.
You open an Issue with a feature request. The "Scrum Master" agent breaks it into a Kanban board. The "Planner" writes a feature issue with the implementation spec. Once you approve it, the "Dev" spins up an ephemeral, serverless container, writes the code, and opens a PR. The "QA" agent reviews the PR, runs tests, and leaves comments.
The upside of this architecture is that every task runs in a sterile, single-use runner, so there are no corrupted environments and no lingering state. It relies completely on GitHub's permission model: the agents only see what they are explicitly granted access to.
Everything happens in standard PR diffs and Issue comments, not a terminal window or chat app.
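To make the handoff idea concrete, here is a simplified illustration of how GitHub events could be mapped to agent roles. This is a sketch, not the actual code from the template: the label names ("needs-spec", "approved") and the role identifiers are made up for the example, and it assumes approval is signalled by adding a label. The only parts taken from GitHub itself are the event names ("issues", "pull_request") and the "action" and "label" fields of the webhook payload.

```python
from typing import Optional

# Hypothetical label names, chosen for illustration only.
NEEDS_SPEC_LABEL = "needs-spec"   # assumed: set by the Scrum Master on a board item
APPROVED_LABEL = "approved"       # assumed: set by a human once the spec looks right


def route_event(event_name: str, payload: dict) -> Optional[str]:
    """Return the agent role that should handle a GitHub event, or None."""
    action = payload.get("action")

    if event_name == "issues":
        if action == "opened":
            # A new feature request goes to the Scrum Master for breakdown.
            return "scrum_master"
        if action == "labeled":
            # The "labeled" payload carries the label that was just added.
            label = (payload.get("label") or {}).get("name")
            if label == NEEDS_SPEC_LABEL:
                return "planner"
            if label == APPROVED_LABEL:
                return "dev"

    if event_name == "pull_request" and action in {"opened", "synchronize"}:
        # New PRs and new pushes to an open PR go to QA for review.
        return "qa"

    # Anything else is ignored, so no agent runs.
    return None


if __name__ == "__main__":
    print(route_event("issues", {"action": "opened"}))
    print(route_event("issues", {"action": "labeled", "label": {"name": "approved"}}))
    print(route_event("pull_request", {"action": "opened"}))
```

In a setup like this, each role would run as its own job on a fresh runner, and the only thing passed between agents is what is already visible on GitHub (issues, labels, PRs, comments).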
The open-source template (using GitHub Actions) is here:
https://github.com/plusai-solutions/ai-scrum-master-template
I'm also building a fully managed, serverless GitHub App version with instant dependency hydration (so you don't have to wait 2 minutes for node_modules to install on every run, or burn tokens having the AI set up the environment). The initial version will support Python and TypeScript/Node.js environments, and I plan to add more later, such as an Android dev environment. If you want to skip the Actions setup and run your agent in a preset environment, you can join the waitlist here:
https://collo.dev
I would love to hear HN's thoughts on the architecture, specifically regarding the multi-agent handoffs via GitHub webhook events.