Ouch, found the killer: it takes up 0.1 mm^2 in area. That's a showstopper. Hopefully they can scale it down or use it for server infra.
> bitcell achieves at least 10 GHz read, write, and compute operations entirely in the optical domain
> Validated on GlobalFoundries' 45SPCLO node
> X-pSRAM consumed 13.2 fJ energy per bit for XOR computation
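For a rough sense of scale, here is a back-of-the-envelope sketch using only the figures quoted above. It assumes one XOR per bitcell per cycle sustained at the quoted 10 GHz, and that the 0.1 mm^2 mentioned above is per bitcell; both are my assumptions, not claims from the paper, and the function names are made up for illustration.

```python
# Back-of-the-envelope numbers from the quoted figures.
# Assumptions (not from the paper): one XOR per bitcell per cycle,
# sustained at 10 GHz, and 0.1 mm^2 being the area of a single bitcell.

ENERGY_PER_BIT_J = 13.2e-15   # 13.2 fJ per bit for XOR (quoted)
RATE_HZ = 10e9                # 10 GHz operation (quoted lower bound)
AREA_PER_BITCELL_MM2 = 0.1    # from the comment above; per-bitcell is assumed


def bitcell_power_watts(energy_per_bit_j, ops_per_second):
    """Average power of one bitcell doing one operation per cycle."""
    return energy_per_bit_j * ops_per_second


def area_for_bits_mm2(n_bits, area_per_bit_mm2=AREA_PER_BITCELL_MM2):
    """Total area if every bit needs its own bitcell, ignoring routing."""
    return n_bits * area_per_bit_mm2


power = bitcell_power_watts(ENERGY_PER_BIT_J, RATE_HZ)
kib_area = area_for_bits_mm2(8 * 1024)

print(f"{power * 1e6:.1f} uW per bitcell")
print(f"{kib_area:.1f} mm^2 for 1 KiB")
```

Under those assumptions the energy side looks modest per bitcell, while the area side is what hurts: even 1 KiB would be larger than most whole dies.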
Don't only think about area.
It's only when you expect data to be able to cross a chip in a single clock cycle that you need to slow down to the 5 GHz or so that CPUs have trouble exceeding.
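The clock-cycle point can be made concrete with a quick calculation of how far a signal can travel in one cycle. This is only an upper bound based on the speed of light: real on-chip wires are RC-limited and much slower, and `velocity_factor` is an illustrative knob I made up, not a measured value.

```python
# Upper bound on distance covered in one clock cycle.
# velocity_factor is a hypothetical fraction of c; 1.0 means vacuum.

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def distance_per_cycle_mm(freq_hz, velocity_factor=1.0):
    """Distance (mm) a signal travels in one clock period."""
    return C_M_PER_S * velocity_factor / freq_hz * 1000.0


for ghz in (5, 10, 60):
    d = distance_per_cycle_mm(ghz * 1e9)
    print(f"{ghz} GHz: at most {d:.1f} mm per cycle")
```

At 5 GHz the bound is about 60 mm, a few die widths; at 60 GHz it is about 5 mm, so a single-cycle trip across a large die is ruled out even in the best case.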
The idea of RAM itself is the bottleneck. If you can load data in one end of a process, and get results out the other end, without ever touching RAM, you can do wonders.
(I did a Google search on the grant acknowledged in the paper; no connection.)
[0] https://sam.gov/opp/e0fb2b2466cd470481b0ca5cab3d210d/view
Scene_Cast2•5h ago
I understand that you can get highly power-efficient XORs, for example. But if we go down this path, would they help with a matrix multiply? Or the bias term of an FFN? Would there be any improvement (i.e. is there anything to offload) in regular business logic? Should I think of it as a more efficient but highly limited DSP? Or a fixed-function accelerator replacement (e.g. "we want to encrypt this segment of memory")?
roflmaostc•5h ago
For example, in this work: Lin, Z., Shastri, B.J., Yu, S. et al. 120 GOPS Photonic tensor core in thin-film lithium niobate for inference and in situ training. Nat Commun 15, 9081 (2024). https://doi.org/10.1038/s41467-024-53261-x
they achieve a "weight update speed of 60 GHz", which is much faster than the average ~3-4 GHz CPU.
GloamingNiblets•4h ago
It's still very niche but could offer enormous power savings for ML inference.
larodi•3h ago
IBM is experimenting in this direction, or at least they claim to be, here: https://www.ibm.com/think/topics/neuromorphic-computing
There is another CPU that was recently featured which again has a lattice, sort of like an FPGA but very fast: different modules are loaded with tasks, each module pumps data to another, and an orchestrator decides how and what goes into each of them.
oneseven•2h ago
https://news.ycombinator.com/item?id=44685050
phkahler•49m ago
woodrowbarlow•3h ago