I think it's really interesting to look at how the GPU market is evolving. TensorPool [1] (no affiliation), for example, is a startup aiming to lower GPU inference costs.
There was some research on energy consumption a couple of years back [2], but after a brief search I haven't found anything more recent.
I'd be interested to hear the community's thoughts on energy costs and provisioning spend as usage grows over time.
[1] https://tensorpool.dev/
[2] GPT-4 energy consumption: https://www.sciencedirect.com/science/article/pii/S2542435123003653
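For scale, here's a minimal back-of-the-envelope sketch in Python; all the numbers are assumptions (roughly H100-class board power, average utilization, industrial electricity rate), not measurements:

    # Rough per-GPU electricity cost; every constant below is an assumption.
    board_watts = 700        # ~H100 SXM board power
    utilization = 0.7        # assumed average draw as a fraction of peak
    price_per_kwh = 0.10     # USD, assumed industrial rate

    hours_per_month = 24 * 30
    kwh_per_month = board_watts * utilization * hours_per_month / 1000
    cost_per_month = kwh_per_month * price_per_kwh
    print(f"{kwh_per_month:.0f} kWh/month -> ${cost_per_month:.2f}/GPU/month")
    # ~353 kWh/month -> ~$35/GPU/month, before cooling/PUE overhead

Even doubling that for datacenter overhead, electricity is small next to the amortized hardware cost of the GPU itself, which is part of why provisioning spend dominates these discussions.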
westurner•4h ago
Though, some GPUs include TPU-like hardware; Nvidia's DLSS 3, for example, runs on the Tensor Cores in RTX GPUs.
"A PCIe Coral TPU Finally Works on Raspberry Pi 5" (2023) https://news.ycombinator.com/item?id=38310063
"ARM adds neural accelerators to GPUs" (2025) https://news.ycombinator.com/item?id=44919793
From "The von Neumann bottleneck is impeding AI computing?" (2025) https://news.ycombinator.com/item?id=45398473 :
> How does Cerebras WSE-3 with 44GB of 'L2' on-chip SRAM compare to Google's TPUs, Tesla's TPUs, NorthPole, Groq LPU, Tenstorrent's, and AMD's NPU designs?
Tensor Processing Unit: https://en.wikipedia.org/wiki/Tensor_Processing_Unit
..
- "Ask HN: Are you paying electricity bills for your service?" (2024) https://news.ycombinator.com/item?id=42454547 re: Zero Water datacenters
- "Show HN: LangSpend – Track LLM costs by feature and customer (OpenAI/Anthropic)" (2025-10) https://news.ycombinator.com/item?id=45771618