Ask HN: How is GPU power draw measured at scale?

4•anax32•3h ago

How do people measure GPU power usage on large (32-GPU) self-hosted setups or small multi-rack setups? I've seen some PDUs that collect and transmit data, but I'm unsure what the usual process is, and whether or how people do this on small builds.

Currently I poll NVML's nvmlDeviceGetPowerUsage every 100 ms during inference, record peak and mean power per request, and get data like this:

model                      mean-power range (W)   spread (W)   stdev
qwen3-8b                   114.3-121.9            7.6          1.17
llama-3.1-8b-instruct      104.7-122.1            17.4         4.29
qwen2.5-1.5b-instruct      53.7-73.0              19.3         5.23
mistral-7b-instruct-v0.3   96.2-120.0             23.8         6.01
qwen2.5-7b-instruct        88.7-124.5             35.8         7.73
gemma-3-1b-it              49.4-56.7              7.3          2.13

This is per-GPU, single-card data. I don't know whether anything like per-request attribution survives at rack scale, or whether monitoring there happens entirely at the PDU/BMC level instead.
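A minimal sketch of what a per-request polling loop like this could look like, assuming the pynvml Python bindings for NVML. PowerSampler and nvml_reader are hypothetical names used only for illustration; the reader is injected so the sampler can be tried without a GPU. nvmlDeviceGetPowerUsage reports milliwatts, so the reader converts to watts.

```python
import statistics
import threading


class PowerSampler:
    """Polls a power-reading callable on a background thread for one request."""

    def __init__(self, read_watts, interval_s=0.1):
        self.read_watts = read_watts  # callable returning instantaneous watts
        self.interval_s = interval_s  # 0.1 s matches the 100 ms polling above
        self.samples = []
        self._stop = threading.Event()
        self._thread = None

    def _run(self):
        # Sample first, then wait, so even a very short request gets one reading.
        while True:
            self.samples.append(self.read_watts())
            if self._stop.wait(self.interval_s):
                break

    def __enter__(self):
        self.samples = []
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
        return False

    def summary(self):
        # Peak and mean power over the samples taken during the request.
        return {
            "mean_w": statistics.fmean(self.samples),
            "peak_w": max(self.samples),
            "n": len(self.samples),
        }


def nvml_reader(gpu_index=0):
    """Returns a callable that reads one GPU's power in watts (assumes pynvml)."""
    import pynvml  # imported lazily so the sampler works without NVML installed

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    # nvmlDeviceGetPowerUsage returns milliwatts.
    return lambda: pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0
```

Usage would be along the lines of `with PowerSampler(nvml_reader(0)) as s: run_inference()` followed by `s.summary()`, where run_inference stands in for whatever serves the request.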

Comments

lemonademan•4m ago

My guess is that once you get beyond a handful of GPUs, people end up using both levels of telemetry, because they answer different questions. NVML is good for per-request attribution and for understanding model behavior, but I believe PDU/BMC measurements are better suited to measuring actual power draw, since they capture everything (CPUs, networking, PSU losses, fans, etc.).

For instance, people running 32+ GPU setups probably sample rack/PDU power about once a second and correlate it with per-GPU data by timestamp, rather than trying to preserve strict per-request attribution at the rack level.

Either way, I haven't seen many people publish how they instrument this in practice, so take what I wrote with a grain of salt. I simply wanted to share a little of what I understand, and I hope it helps.
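To make the timestamp-correlation idea concrete, here is a rough sketch under stated assumptions: PDU readings arrive as (unix_timestamp, watts) pairs sorted by time, and each request has a known start and end time. The function name rack_power_for_window and the data layout are hypothetical, not taken from any particular PDU or BMC API.

```python
from bisect import bisect_left, bisect_right


def rack_power_for_window(pdu_samples, start, end):
    """Mean rack watts over PDU samples with start <= timestamp <= end.

    pdu_samples: list of (unix_ts, watts) tuples, sorted by timestamp.
    Returns None when no sample falls inside the window.
    """
    times = [t for t, _ in pdu_samples]
    lo = bisect_left(times, start)   # first sample at or after start
    hi = bisect_right(times, end)    # one past the last sample at or before end
    window = pdu_samples[lo:hi]
    if not window:
        return None
    return sum(w for _, w in window) / len(window)


# Example with 1 Hz readings (made-up numbers, for illustration only).
samples = [(0, 1000.0), (1, 1100.0), (2, 1200.0), (3, 1300.0)]
print(rack_power_for_window(samples, 1, 2))
```

Two caveats follow from the sampling rate: a request shorter than the PDU interval can fall between samples and return None, and the rack figure covers everything running in that window, so it gives context for the per-GPU NVML numbers rather than per-request attribution.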
