Problem (for LLM startups + devs): When an agent “does a task,” it’s surprisingly hard to answer basic per-run questions reliably: did it finish, how long did it take, how much block I/O did it do, did it touch the network, and can all of that be recorded as one clean row for billing or debugging? Firecracker makes isolation fast, but the VM lifecycle and its metrics are easy to get subtly wrong: metrics that never get flushed, ambiguous completion signals, and results scraped out of logs.
Solution: fc-metrics boots one microVM per job and emits a single JSON “receipt” at the end: timing, exit code, block read/write bytes from Firecracker metrics, optional net rx/tx (via TAP + MMDS traffic), plus a simple guest “done marker” printed once to the serial console.
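To make the idea of a “receipt” concrete, here is a minimal Python sketch of how one could be assembled on the host. It is not the fc-metrics implementation. The marker format (`FC_TASK_DONE exit=N`), the receipt field names, and the `build_receipt` helper are all hypothetical. The sketch also assumes the Firecracker metrics file is line-delimited JSON with per-flush delta counters under `block` (`read_bytes`, `write_bytes`) and `net` (`rx_bytes_count`, `tx_bytes_count`); check the metrics output of your Firecracker version before relying on those names.

```python
import json
import re

# Hypothetical done-marker format; the real fc_task.sh may print something else.
DONE_RE = re.compile(r"^FC_TASK_DONE exit=(\d+)\s*$", re.MULTILINE)


def sum_counters(metrics_lines, section, fields):
    """Sum the given counters across all flushed metrics lines.

    Assumption: each line is one JSON object and each counter holds the
    delta since the previous flush, so the per-run total is the sum.
    """
    totals = dict.fromkeys(fields, 0)
    for line in metrics_lines:
        line = line.strip()
        if not line:
            continue
        values = json.loads(line).get(section, {})
        for field in fields:
            totals[field] += int(values.get(field, 0))
    return totals


def build_receipt(job_id, started_at, ended_at, serial_log, metrics_lines,
                  with_net=False):
    """Return a single-line JSON receipt for one microVM run."""
    markers = DONE_RE.findall(serial_log)
    # Completion is unambiguous only if the marker appears exactly once.
    done = len(markers) == 1
    block = sum_counters(metrics_lines, "block", ("read_bytes", "write_bytes"))
    receipt = {
        "job_id": job_id,
        "duration_s": round(ended_at - started_at, 3),
        "done": done,
        "exit_code": int(markers[0]) if done else None,
        "block_read_bytes": block["read_bytes"],
        "block_write_bytes": block["write_bytes"],
    }
    if with_net:
        net = sum_counters(metrics_lines, "net",
                           ("rx_bytes_count", "tx_bytes_count"))
        receipt["net_rx_bytes"] = net["rx_bytes_count"]
        receipt["net_tx_bytes"] = net["tx_bytes_count"]
    return json.dumps(receipt, sort_keys=True)
```

Treating a missing or repeated marker as “not done” keeps the completion signal unambiguous, and emitting one JSON object per run means each receipt can be appended as a row to a billing or debugging table.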
What’s included: Makefile targets to fetch Firecracker kernel/rootfs artifacts, patch the guest with a oneshot fc_task.sh, then run and inspect receipts. Docs include a minimal GCE nested virtualization setup.
Repo: https://github.com/joshfischer1108/fc-metrics