[+114% Attention acceleration]
Any idea how they got +50% FP4 from the same silicon? "Firmware" improvements?
Or did they find a way to disable the INT8 and FP64 units and reuse them, e.g. as overspill registers?
Any other ideas why INT8/FP64 throughput is down 97% on the same chip? QA/certification issues?
96% TCO and energy savings for 65 racks of eight-way HGX H100 (air-cooled) versus 1 rack of GB200 NVL72 (liquid-cooled), at equivalent performance on GPT-MoE-1.8T real-time inference throughput.
Big if true. Energy and cooling costs can represent 30-40% of the total cost of setting up and running an AI data center.
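As a rough sanity check, here is a back-of-the-envelope sketch in Python; both per-rack power figures are my own assumptions for illustration, not NVIDIA's numbers:

    # Back-of-the-envelope check of the "~96% energy savings" claim.
    # Both per-rack power figures below are assumptions, not official specs.
    h100_racks, gb200_racks = 65, 1
    kw_per_h100_rack = 40.0    # assumed: air-cooled eight-way HGX H100 rack
    kw_per_gb200_rack = 120.0  # assumed: liquid-cooled GB200 NVL72 rack

    h100_kw = h100_racks * kw_per_h100_rack     # 2600 kW total
    gb200_kw = gb200_racks * kw_per_gb200_rack  # 120 kW total

    savings = 1 - gb200_kw / h100_kw
    print(f"H100 fleet: {h100_kw:.0f} kW, GB200: {gb200_kw:.0f} kW")
    print(f"Energy savings at equal throughput: {savings:.0%}")  # ~95%

Under those assumed numbers the savings land in the mid-90s percent, so the headline figure is at least arithmetically plausible if the 65-to-1 rack equivalence holds.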
aurareturn•6h ago
Certainly comparing Blackwell's FP4 performance to H100 FP16, no?
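A minimal sketch of why that framing inflates the headline number: peak tensor throughput roughly doubles each time precision halves, so FP4-vs-FP16 bakes in ~4x before any real architectural gain (the FP16 baseline below is an assumed placeholder, not a quoted spec):

    # Peak tensor throughput roughly doubles per precision halving, so an
    # FP4-vs-FP16 comparison starts ~4x ahead of a like-for-like one.
    # The FP16 baseline is an assumption for illustration only.
    rates = {"FP16": 1000.0}           # assumed dense FP16 TFLOPS baseline
    rates["FP8"] = rates["FP16"] * 2   # half the bits -> ~2x the rate
    rates["FP4"] = rates["FP8"] * 2    # halve again -> ~4x FP16
    for fmt, tflops in rates.items():
        print(f"{fmt}: ~{tflops:.0f} TFLOPS ({tflops / rates['FP16']:.0f}x FP16)")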
Sweepi•6h ago
In case you want to compare the complete specs: I would post them here, but since HN supports less formatting than early-2000s BB forums, check them here: https://www.forum-3dcenter.org/vbulletin/showpost.php?p=1380...