Our team at CircleCI built Chunk sidecars after repeatedly running into the same issue internally: by the time our CI catches a failure, the agent has already moved on and most of the useful context is gone.
The basic idea of Chunk sidecars is to move fast, lightweight validation into the inner development loop.
Chunk sidecars run scoped “microbuilds” inside a lightweight microVM that mirrors your CI environment. They try to auto-detect your stack and test commands, sync changes from the agent session, and run validations before commit/push.
A few implementation details that might be interesting:
- validation hooks trigger automatically during agent stop/evaluation events
- warm snapshots keep startup times low
- validations run against environments matching the CI stack instead of local machine state
- microbuilds only run the relevant slice instead of the entire pipeline
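To make the "relevant slice" idea concrete, here is a minimal sketch of one way to map changed files to a scoped set of test commands. This is not how Chunk does it internally; the `SLICES` mapping, the directory names, and the `select_commands` helper are all hypothetical and only illustrate the general technique.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a top-level directory to the test command that
# covers it. These paths and commands are illustrative assumptions, not
# Chunk's actual configuration or detection logic.
SLICES = {
    "api": "pytest tests/api",
    "web": "npm test --prefix web",
}


def select_commands(changed_files, slices=SLICES):
    """Return the test commands for the slices touched by the changed files.

    Commands are returned once each, in the order their slice is first seen.
    Files outside any known slice are ignored in this sketch.
    """
    commands = []
    for path in changed_files:
        parts = PurePosixPath(path).parts
        top_level = parts[0] if parts else ""
        command = slices.get(top_level)
        if command and command not in commands:
            commands.append(command)
    return commands
```

With this mapping, a change touching only `api/handlers.py` would select just the `api` test command rather than every command in the pipeline. A real tool would likely need a fallback (for example, running everything) for files it cannot attribute to a slice.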
In our own experiments we measured:
- ~27 second average microbuild compute
- ~5 minutes total billable compute for equivalent full CI runs
- 3x–5x lower token usage in retry loops
The compute comparison is billable compute vs billable compute, not wall clock time. Full CI pipelines were parallelized.
The 27s figure is with warm snapshots; first-time setup takes about 15 minutes. We tested this on our own pipeline, not a large corpus. Larger repos with heavier deps will vary.
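For a rough sense of scale, the numbers above can be plugged into a quick back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the one-time setup is billed at the same rate as a microbuild and that each microbuild stands in for one full CI run that would otherwise have been spent on a failure. Neither assumption is something we measured.

```python
import math

# Figures quoted in the post (approximate averages from one pipeline).
MICROBUILD_SECONDS = 27       # warm-snapshot microbuild compute
FULL_CI_SECONDS = 5 * 60      # total billable compute for a full CI run
SETUP_SECONDS = 15 * 60       # one-time first setup


def compute_ratio(full=FULL_CI_SECONDS, micro=MICROBUILD_SECONDS):
    """How many times more billable compute a full run uses than a microbuild."""
    return full / micro


def break_even_runs(setup=SETUP_SECONDS, full=FULL_CI_SECONDS, micro=MICROBUILD_SECONDS):
    """Number of microbuilds needed before the one-time setup cost is recovered.

    Assumption (not from the measurements): every microbuild replaces one
    full CI run, and setup time is billed like microbuild time.
    """
    saved_per_run = full - micro
    return math.ceil(setup / saved_per_run)


print(f"full CI / microbuild compute: {compute_ratio():.1f}x")
print(f"break-even after {break_even_runs()} microbuilds")
```

Under those assumptions the full run uses roughly 11x the compute of a microbuild, and the setup cost is recovered after about 4 microbuilds. Real numbers will shift with repo size and how often a microbuild actually prevents a failing CI run.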
Under the hood it's currently Firecracker microVMs running on E2B infrastructure. Current spec: 4 CPUs, 8 GB RAM (comparable to a Docker large). This may change depending on feedback and learnings.
Short demo video (YT) here: https://circle.ci/4dq9fph
Blog post: https://circleci.com/blog/chunk-sidecars/
Chunk CLI GitHub repo: https://github.com/CircleCI-Public/chunk-cli
This works with any CircleCI account (including the free one) and integrates with Claude Code, Codex, Cursor, or your own agents. The project is open source and also has features that work without CircleCI connected. Install the Chunk CLI and run "chunk init"; the sidecar then tries to auto-detect your stack and test commands.
Would love all feedback, especially from people already experimenting with agentic workflows. We're especially curious whether others are seeing the same CI failure rate pattern and the "widening gap" between the inner and outer dev/SDLC loops with agent-generated code.
grimleech•42m ago
Edit: nevermind, working now
olafmol•34m ago
Tnx for checking, and reconfirming!