I’ve been interested in volume rendering for many years, and once WebGPU became widely available, I wanted to explore how multi-GB volumetric datasets can be streamed and rendered interactively in the browser — without downloading the entire volume up front.
Kiln is a virtualized volume renderer for WebGPU with out-of-core streaming and a fixed VRAM budget.
Many WebGPU volume rendering examples assume the data is compact enough to be downloaded in full before rendering begins and kept fully resident during interaction.
Kiln instead streams multi-GB datasets over HTTP and renders them using a fixed-size GPU page cache and virtual texture indirection. The renderer is fully compute-shader-based and targets WebGPU directly (no WebGL fallback).
Kiln expects the volume to be available as multi-resolution bricks (either preprocessed into its custom format or provided via a chunked format such as OME-Zarr). At runtime, only bricks relevant to the current view are fetched via HTTP range requests, decompressed asynchronously in worker threads, and uploaded to VRAM, while stale bricks are evicted with an LRU policy.
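To make the cache behavior concrete, here's a minimal CPU-side sketch of a fixed-capacity LRU brick cache. This is illustrative only, not Kiln's actual code; the class and method names are assumptions, and it exploits the insertion-order guarantee of JavaScript's `Map` to track recency.

```typescript
// Hypothetical sketch: fixed number of atlas slots, LRU eviction on overflow.
type BrickId = string;   // e.g. "level/z/y/x"
type AtlasSlot = number; // index into the fixed-size GPU brick atlas

class BrickCache {
  // Map iteration order is insertion order, so the first key is the LRU entry.
  private slots = new Map<BrickId, AtlasSlot>();
  private free: AtlasSlot[];

  constructor(capacity: number) {
    this.free = Array.from({ length: capacity }, (_, i) => i);
  }

  // Returns the atlas slot for a brick, evicting the LRU brick if no slot is free.
  acquire(id: BrickId): { slot: AtlasSlot; evicted?: BrickId } {
    const hit = this.slots.get(id);
    if (hit !== undefined) {
      // Refresh recency by re-inserting at the back of the Map.
      this.slots.delete(id);
      this.slots.set(id, hit);
      return { slot: hit };
    }
    let evicted: BrickId | undefined;
    let slot = this.free.pop();
    if (slot === undefined) {
      // Cache full: reclaim the slot of the least-recently-used brick.
      evicted = this.slots.keys().next().value!;
      slot = this.slots.get(evicted)!;
      this.slots.delete(evicted);
    }
    this.slots.set(id, slot);
    return { slot, evicted };
  }
}
```

Fetching a brick's compressed bytes would then be an ordinary range request, something like `fetch(url, { headers: { Range: "bytes=" + off + "-" + (off + len - 1) } })`, handed to a worker for decompression before upload.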
The logical brick layout is decoupled from physical GPU memory. A compact indirection texture maps logical brick coordinates to atlas slots, and each ray sample resolves this indirection before performing the density lookup inside the brick.
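The per-sample lookup can be sketched on the CPU side like this. It is a simplified mirror of what the compute shader would do, under assumed conventions (32-voxel bricks, no border padding, an atlas tiled along x); the real layout and names will differ.

```typescript
// Illustrative sketch of virtual-texture indirection (assumed conventions).
const BRICK = 32; // voxels per brick edge, ignoring any border padding

// indirection[brickIndex] holds the brick's atlas slot, or -1 if non-resident.
// Returns the atlas voxel address for a global voxel coordinate, or null if
// the brick is not resident (the renderer would fall back to a coarser level).
function sampleAddress(
  voxel: [number, number, number],   // global voxel coordinate at this level
  gridDim: [number, number, number], // bricks per axis at this level
  indirection: Int32Array            // flattened logical-grid -> slot map
): [number, number, number] | null {
  const bx = Math.floor(voxel[0] / BRICK);
  const by = Math.floor(voxel[1] / BRICK);
  const bz = Math.floor(voxel[2] / BRICK);
  const slot = indirection[(bz * gridDim[1] + by) * gridDim[0] + bx];
  if (slot < 0) return null; // brick not resident
  // Simplified atlas layout: bricks tiled only along x.
  return [slot * BRICK + (voxel[0] % BRICK), voxel[1] % BRICK, voxel[2] % BRICK];
}
```

The shader version performs the same two-step resolve per ray sample: one fetch into the indirection texture, then the density fetch inside the resolved brick.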
This architecture allows large datasets to become interactive quickly, with a short time-to-first-render and a constant, modest VRAM footprint.
The renderer is format-agnostic. Currently there are two data providers: a simple custom sharded binary format and an experimental OME-Zarr integration, both using the same rendering core.
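A provider abstraction along these lines is what makes that possible. This is a hypothetical shape, not Kiln's actual interface; the names and metadata fields are assumptions, shown with a toy in-memory implementation.

```typescript
// Hypothetical provider interface: anything that can describe a volume and
// deliver decoded bricks can feed the same rendering core.
interface BrickKey { level: number; x: number; y: number; z: number; }

interface VolumeProvider {
  // Static metadata: voxel dimensions, brick edge length, mip level count.
  describe(): { dims: [number, number, number]; brickSize: number; levels: number };
  // Fetches and decodes one brick; resolves to raw voxels ready for GPU upload.
  loadBrick(key: BrickKey): Promise<Uint8Array>;
}

// Toy in-memory provider (e.g. for tests): every voxel has density 128.
class ConstantProvider implements VolumeProvider {
  describe() {
    return { dims: [64, 64, 64] as [number, number, number], brickSize: 32, levels: 1 };
  }
  async loadBrick(_key: BrickKey): Promise<Uint8Array> {
    return new Uint8Array(32 * 32 * 32).fill(128);
  }
}
```

An OME-Zarr provider would implement the same two methods by translating brick keys to chunk paths and decoding the chunk codec chain, while the sharded-binary provider would translate them to byte ranges within shards.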
Happy to discuss architectural decisions or performance trade-offs.