There have also been proposals to use flash memory in inference accelerators instead of DRAM. You can make high bandwidth flash using the same stacking technique used for HBM DRAM.
It is obviously unsuitable for training because of limited write cycles. But the read bandwidth is decent, and the density/$ is much better.
dlcarrier•1h ago
I'm convinced that if Intel Optane had hung around a little longer, it would now be one of the highest growth products at Intel.
Legend2440•1h ago
It is obviously unsuitable for training because of limited write cycles. But the read bandwidth is decent, and the density/$ is much better.