It falls back to WebGL2 when WebGPU isn't available (Safari, Firefox) and to CPU for Node/headless. Same API everywhere. ~160KB minified, zero dependencies.
The shaders are pre-built WGSL—add, mul, relu, layer norm, attention scores, etc. I added FFT and spectrogram recently, plus conv2d and pooling (CPU for now). Reductions like sum/max use multi-pass since you can't fit 1M elements in one workgroup.
Demos: [phantasm0009.github.io/accel-gpu](https://phantasm0009.github.io/accel-gpu/) — basic math, image processing, heatmap, neural net inference, N-body sim. Would love feedback.
GitHub: [github.com/Phantasm0009/accel-gpu](https://github.com/Phantasm0009/accel-gpu)