I've read elsewhere that it uses 10x less electricity for inference workloads than standard GPUs. It's not clear to me: are the model weights built into the silicon (e.g. a per-model tapeout), or is this a new kind of chip architecture that still keeps weights in DRAM/SRAM?
slongfield•1h ago
Full disclosure: I work at Etched.
Weights are not burnt into silicon per-model. They're in SRAM/HBM. There's some more info on the website (etched.com) and we'll be sharing more details about model benchmarks this summer.
da-x•1h ago