This becomes a problem when teams rely on that metric for capacity planning or optimization decisions, because it can make underutilized systems look saturated.
We're releasing an open-source (Apache 2.0) tool, Utilyze, to measure GPU utilization differently. It samples hardware performance counters and reports compute and memory throughput relative to the hardware's theoretical limits. It also estimates an attainable utilization ceiling for a given workload.
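To make "throughput relative to theoretical limits" concrete, here is a minimal sketch of the arithmetic. It is an illustration only, not Utilyze's implementation. The names `HardwareLimits`, `utilization`, and `roofline_ceiling` are hypothetical, the peak numbers are made up, and the ceiling uses a simple roofline model, which is an assumption about how such a ceiling could be estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareLimits:
    """Theoretical peaks for one GPU (illustrative values, not a real part)."""
    peak_flops: float      # FLOP/s
    peak_bandwidth: float  # bytes/s


def utilization(achieved: float, peak: float) -> float:
    """Fraction of a theoretical peak that was actually achieved."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return achieved / peak


def roofline_ceiling(limits: HardwareLimits, flops_per_byte: float) -> float:
    """Attainable compute utilization under a simple roofline model.

    A workload with low arithmetic intensity (FLOPs per byte moved) is
    capped by memory bandwidth before it can reach peak compute.
    """
    attainable_flops = min(limits.peak_flops,
                           flops_per_byte * limits.peak_bandwidth)
    return attainable_flops / limits.peak_flops


# Hypothetical GPU: 100 TFLOP/s peak compute, 2 TB/s peak memory bandwidth.
limits = HardwareLimits(peak_flops=100e12, peak_bandwidth=2e12)

# Hypothetical counter-derived throughput for one sampling window.
compute_util = utilization(30e12, limits.peak_flops)       # achieved FLOP/s
memory_util = utilization(1.5e12, limits.peak_bandwidth)   # achieved bytes/s

# A workload doing 20 FLOPs per byte cannot exceed this compute utilization.
ceiling = roofline_ceiling(limits, flops_per_byte=20.0)

print(f"compute {compute_util:.0%}, memory {memory_util:.0%}, ceiling {ceiling:.0%}")
```

In this made-up example the workload sits at 30% compute utilization against a 40% attainable ceiling, so it is much closer to its practical limit than the raw 30% suggests.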
GitHub link: https://github.com/systalyze/utilyze
We'd love to hear your thoughts!
xtimecrystal•1h ago
At the moment (v0.1.3) it is more helpful for compute visualization, but I still need to keep track of memory usage, processes, temperature, fan speed, etc., which prevents it from becoming a full drop-in replacement for `nvidia-smi` for me.
ManyaGhobadi•3m ago