TheUncharted•1h ago
I was inspired by Monty and built Zapcode — an open-source TypeScript subset interpreter in Rust, designed for AI agents that write code instead of chaining tool calls.
The problem: AI agents are better when they write code rather than calling tools one by one. Cloudflare (https://blog.cloudflare.com/code-mode/), Anthropic, HuggingFace, and Pydantic are all pushing this pattern. But running AI-generated code is dangerous and slow: Docker adds 200-500ms of cold start, and V8 isolates add ~20MB to your binary.
Zapcode takes a different approach: a purpose-built bytecode VM that starts in ~2µs, enforces a deny-by-default sandbox at the language level, and can snapshot execution state to <2KB for later resumption.
Instead of three LLM round-trips:
LLM → tool A → LLM → tool B → LLM → tool C → LLM
The LLM writes one code block:
    const a = await getWeather("Tokyo");
    const b = await getWeather("Paris");
    const flights = await searchFlights(a.temp < b.temp ? "Tokyo" : "Paris", ...);
One LLM call, three tool executions. Intermediate values stay in the code; they never pass back through the model.
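To make the pattern concrete, here is a runnable sketch. getWeather and searchFlights are stand-in mocks (not Zapcode's API, and the temperatures are made up); the point is the shape of the guest code: one block, intermediate values held in locals, a single result coming back out.

```typescript
type Weather = { city: string; temp: number };

// Mock tools. In Zapcode these would be external functions the host registers.
async function getWeather(city: string): Promise<Weather> {
  const temps: Record<string, number> = { Tokyo: 18, Paris: 11 }; // invented data
  return { city, temp: temps[city] ?? 0 };
}

async function searchFlights(dest: string): Promise<string[]> {
  return [`FL-100 to ${dest}`, `FL-200 to ${dest}`];
}

// The single code block an LLM might emit for this task:
async function agentTurn(): Promise<string[]> {
  const a = await getWeather("Tokyo");
  const b = await getWeather("Paris");
  // The comparison happens here, in code; neither temperature
  // round-trips through the model.
  return searchFlights(a.temp < b.temp ? "Tokyo" : "Paris");
}

agentTurn().then((flights) => console.log(flights));
```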
Key design decisions:
- TypeScript subset, not full TS. Enough for what LLMs generate (variables, functions, classes, async/await, closures, destructuring, 30+ string methods, 25+ array methods) — not enough to be dangerous (no eval, no import, no require, no fs/net/env).
- Sandbox by construction. The core crate has zero std::fs, zero std::net, zero std::env, zero unsafe blocks. The only way out is external functions you explicitly register — when the VM hits one, it suspends and returns a snapshot. Your code resolves the call, not the guest.
- Snapshot/resume. The VM serializes its entire state to bytes (<2KB typical). You can resume in a different process, on a different machine, days later. This matters for long-running tools (slow API calls, human-in-the-loop approval).
- Embeddable everywhere. Rust, Node.js (napi), Python (PyO3), WebAssembly. No runtime dependencies.
Inspired by Monty (https://github.com/pydantic/monty), which does the same for Python. Zapcode is Monty for TypeScript.
Benchmarks (full pipeline, no caching):
Simple expression: 2.1µs
Function call: 4.6µs
Loop (100 iter): 77.8µs
Fibonacci(10): 138.4µs
Snapshot size: <2KB
Cold start: ~2µs
Source: https://github.com/TheUncharted/zapcode