
Go 1.22, SQLite, and Next.js: The "Boring" Back End

https://mohammedeabdelaziz.github.io/articles/go-next-pt-2
1•mohammede•3m ago•0 comments

Laibach the Whistleblowers [video]

https://www.youtube.com/watch?v=c6Mx2mxpaCY
1•KnuthIsGod•5m ago•1 comment

I replaced the front page with AI slop and honestly it's an improvement

https://slop-news.pages.dev/slop-news
1•keepamovin•9m ago•1 comment

Economists vs. Technologists on AI

https://ideasindevelopment.substack.com/p/economists-vs-technologists-on-ai
1•econlmics•11m ago•0 comments

Life at the Edge

https://asadk.com/p/edge
1•tosh•17m ago•0 comments

RISC-V Vector Primer

https://github.com/simplex-micro/riscv-vector-primer/blob/main/index.md
2•oxxoxoxooo•21m ago•1 comment

Show HN: Invoxo – Invoicing with automatic EU VAT for cross-border services

2•InvoxoEU•21m ago•0 comments

A Tale of Two Standards, POSIX and Win32 (2005)

https://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
2•goranmoomin•25m ago•0 comments

Ask HN: Has the Downfall of SaaS Started?

3•throwaw12•26m ago•0 comments

Flirt: The Native Backend

https://blog.buenzli.dev/flirt-native-backend/
2•senekor•28m ago•0 comments

OpenAI's Latest Platform Targets Enterprise Customers

https://aibusiness.com/agentic-ai/openai-s-latest-platform-targets-enterprise-customers
1•myk-e•30m ago•0 comments

Goldman Sachs taps Anthropic's Claude to automate accounting, compliance roles

https://www.cnbc.com/2026/02/06/anthropic-goldman-sachs-ai-model-accounting.html
2•myk-e•33m ago•4 comments

Ai.com bought by Crypto.com founder for $70M in biggest-ever website name deal

https://www.ft.com/content/83488628-8dfd-4060-a7b0-71b1bb012785
1•1vuio0pswjnm7•34m ago•1 comment

Big Tech's AI Push Is Costing More Than the Moon Landing

https://www.wsj.com/tech/ai/ai-spending-tech-companies-compared-02b90046
4•1vuio0pswjnm7•36m ago•0 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
2•1vuio0pswjnm7•37m ago•0 comments

Suno, AI Music, and the Bad Future [video]

https://www.youtube.com/watch?v=U8dcFhF0Dlk
1•askl•39m ago•2 comments

Ask HN: How are researchers using AlphaFold in 2026?

1•jocho12•42m ago•0 comments

Running the "Reflections on Trusting Trust" Compiler

https://spawn-queue.acm.org/doi/10.1145/3786614
1•devooops•47m ago•0 comments

Watermark API – $0.01/image, 10x cheaper than Cloudinary

https://api-production-caa8.up.railway.app/docs
1•lembergs•49m ago•1 comment

Now send your marketing campaigns directly from ChatGPT

https://www.mail-o-mail.com/
1•avallark•52m ago•1 comment

Queueing Theory v2: DORA metrics, queue-of-queues, chi-alpha-beta-sigma notation

https://github.com/joelparkerhenderson/queueing-theory
1•jph•1h ago•0 comments

Show HN: Hibana – choreography-first protocol safety for Rust

https://hibanaworks.dev/
5•o8vm•1h ago•1 comment

Haniri: A live autonomous world where AI agents survive or collapse

https://www.haniri.com
1•donangrey•1h ago•1 comment

GPT-5.3-Codex System Card [pdf]

https://cdn.openai.com/pdf/23eca107-a9b1-4d2c-b156-7deb4fbc697c/GPT-5-3-Codex-System-Card-02.pdf
1•tosh•1h ago•0 comments

Atlas: Manage your database schema as code

https://github.com/ariga/atlas
1•quectophoton•1h ago•0 comments

Geist Pixel

https://vercel.com/blog/introducing-geist-pixel
2•helloplanets•1h ago•0 comments

Show HN: MCP to get latest dependency package and tool versions

https://github.com/MShekow/package-version-check-mcp
1•mshekow•1h ago•0 comments

The better you get at something, the harder it becomes to do

https://seekingtrust.substack.com/p/improving-at-writing-made-me-almost
2•FinnLobsien•1h ago•0 comments

Show HN: WP Float – Archive WordPress blogs to free static hosting

https://wpfloat.netlify.app/
1•zizoulegrande•1h ago•0 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
1•melvinzammit•1h ago•0 comments

GPU Hot: Dashboard for monitoring NVIDIA GPUs on remote servers

https://github.com/psalias2006/gpu-hot
83•github-trending•4mo ago

Comments

github-trending•4mo ago
Hi everyone, I just built a GPU dashboard to check the utilization on NVIDIA cards directly in your browser. It also works with multiple GPUs. The idea is to have real-time metrics from a remote GPU server instead of running nvidia-smi. Let me know if you try it out!
ohong•3mo ago
Hey, I really like your project! And I'm kind of amused by the comments here... Would love to chat more about it if you're down. My contact's in my profile :)
heipei•4mo ago
Obligatory reminder that "GPU utilisation" as a percentage is a meaningless metric and does not tell you how well your GPU is actually utilised.

Does not change the usefulness of this dashboard, just wanted to point it out.

yfontana•4mo ago
Properly measuring "GPU load" is something I've been wondering about, as an architect who's had to deploy ML/DL models but is still relatively new at it. With CPU workloads you can generally tell from %CPU, %Mem and IOs how much load your system is under. But with GPUs I'm not sure how you can tell, other than by just measuring your model execution times. I find it makes it hard to get an idea of whether upgrading to a stronger GPU would help, and by how much. Are there established ways of doing this?
jplusequalt•4mo ago
The CUDA toolkit comes with an occupancy calculator that can help you determine, based on your kernel launch parameters, how busy your GPU will potentially be.

For more information: https://docs.nvidia.com/cuda/cuda-c-programming-guide/#multi...

sailingparrot•4mo ago
For kernel-level performance tuning you can use the occupancy calculator as pointed out by jplusequalt, or you can profile your kernel with Nsight Compute, which will give you a ton of info.

But for model-wide performance, you basically have to come up with your own calculation to estimate the FLOPs required by your model and, based on that, figure out how well your model is maxing out the GPU's capabilities (MFU/HFU).

Here is a more in-depth example on how you might do this: https://github.com/stas00/ml-engineering/tree/master/trainin...
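
A minimal sketch of that estimate (assuming a decoder-only transformer, the common ~6 × params FLOPs-per-token approximation for forward plus backward, and PyTorch; train_step and every concrete number below are placeholders):

  # Hypothetical sketch: estimate model FLOPs utilization (MFU) for one step.
  import time
  import torch

  n_params = 7e9                  # model parameters (placeholder)
  tokens_per_step = 2048 * 2048   # global batch x sequence length (placeholder)
  peak_flops = 989e12             # dense BF16 spec-sheet peak, e.g. H100 SXM (placeholder)

  flops_per_step = 6 * n_params * tokens_per_step  # fwd+bwd approximation

  torch.cuda.synchronize()
  t0 = time.perf_counter()
  train_step()                    # stand-in for your forward+backward+optimizer step
  torch.cuda.synchronize()
  elapsed = time.perf_counter() - t0

  achieved = flops_per_step / elapsed
  print(f"{achieved / 1e12:.0f} TFLOP/s achieved, MFU = {achieved / peak_flops:.1%}")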

villgax•4mo ago
You need to profile them; Nsight is one option, and even torch can produce flame graphs.
hatthew•4mo ago
It's harder than measuring CPU load, and depends a lot on context. For example, often 90% of a GPU's available flops are exclusively for low-precision matrix multiply-add operations. If you're doing full precision multiply-add operations at full speed, do you count that as 10% or 100% load? If you're doing lots of small operations and your warps are only 50% full, do you count that as 50% or 100% load? Unfortunately, there isn't really a shortcut to understanding how a GPU works and knowing how you're using it.
Scene_Cast2•4mo ago
@dang sorry for the meta-comment, but why is yfontana's comment dead? I found it pretty insightful.
kergonath•4mo ago
FYI, adding @ before a user name does nothing besides looking terrible and AFAIK dang does not get a notification when he’s mentioned. If you want to contact him, the best way is to send an email to hn@ycombinator.com .
yfontana•4mo ago
I think I was shadow-banned because my very first comment on the site was slightly snarky, and have now been unbanned.
huevosabio•4mo ago
how so?
sailingparrot•4mo ago
"Utilization" tells you the percentage of your GPU's SM that currently have at least one thread assigned to them.

It does not at all take into account how much that thread is actually using the core to its capacity.

So if e.g. your thread is blocked waiting on some data from another GPU (NCCL) and actually doing nothing, it will still show 100% utilisation. A good way to realize that is when an NCCL call times out after 30 minutes for some reason, but you can see that all your GPUs (except the one that caused the failure) were at 100% util, even though they clearly did nothing but wait.

Another example is operations with low compute intensity: say you want to add 1 to every element of a very large tensor. You effectively have to transfer every element (let's say FP8, so 1 byte) from HBM to L2 memory, which is a very slow operation, to then simply do an add, which is extremely fast. It takes ~1000x more time to move that byte to L2 than it takes to actually do the add, so in effect your "true" utilization is ~0.2%, but nvidia-smi (and this tool) will show 100% for the entire duration of that add.

Sadly there isn't a great general way to monitor "true" utilization during training. Generally you have to come up with an estimate of how many FLOPs your model requires per pass, look at the time it takes to do said pass, and compare the FLOPs/sec you get to Nvidia's spec sheet. If you get around 60% of theoretical FLOPs for a typical transformer LLM training run, you are basically at max utilization.
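
You can reproduce the low-compute-intensity case in a few lines (a sketch assuming PyTorch and a CUDA GPU; the 1 GiB tensor size is arbitrary). While this runs, nvidia-smi will report ~100% utilization even though the GPU is almost entirely waiting on memory:

  # Sketch: a memory-bound element-wise add that still shows "100% util".
  import torch

  x = torch.zeros(1024**3, dtype=torch.uint8, device="cuda")  # 1 GiB tensor

  start = torch.cuda.Event(enable_timing=True)
  end = torch.cuda.Event(enable_timing=True)

  start.record()
  x += 1                           # one read + one write per byte, trivial math
  end.record()
  torch.cuda.synchronize()

  seconds = start.elapsed_time(end) / 1000        # elapsed_time() returns ms
  gbps = 2 * x.numel() / seconds / 1e9            # bytes moved per second
  print(f"effective bandwidth: {gbps:.0f} GB/s")  # compare to your HBM spec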

aprdm•4mo ago
What about energy consumption as a proxy for it?
villgax•4mo ago
Not a great estimator, but still roughly useful; ambient temps/neighboring cards alone might influence it more than the workload does.
sailingparrot•4mo ago
Definitely a better high-level metric than nvidia-smi, and probably fine if you just want a very coarse idea of whether or not you are using the GPUs reasonably at all.

But when you get to the point where you care about a few percentage points of utilisation it's just not reliable enough, as many things can impact energy consumption both ways. E.g. we had a case where the GPU cluster we were using wasn't being cooled well enough, so you would gradually see power draw getting lower and lower as the GPUs throttled themselves to avoid overheating.

You can also find cases where energy consumption is high but MFU/HFU isn't, like memory-intensive workloads.

JackYoustra•4mo ago
IIRC most of the energy goes to memory I/O rather than arithmetic, so it's still not great. A better direction, though.
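
For what it's worth, NVML exposes board power draw directly, so it's cheap to log next to utilization. A sketch, assuming the nvidia-ml-py package (imported as pynvml):

  # Sketch: read power draw against the enforced limit via NVML.
  import pynvml

  pynvml.nvmlInit()
  handle = pynvml.nvmlDeviceGetHandleByIndex(0)

  watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000          # reported in mW
  limit = pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000  # board power cap
  print(f"power: {watts:.0f} W / {limit:.0f} W ({watts / limit:.0%})")

  pynvml.nvmlShutdown()
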
huevosabio•4mo ago
This is a great explanation, thank you!
porridgeraisin•4mo ago
Utilisation is counted by the OS, it's not exposed as a performance counter by the hardware. Thus, it's limited by the level of abstraction presented by the hardware.

It's flawed on CPUs as well, just to a much, much lesser extent, to the point of it actually being useful there.

Basically, the OS sees the CPU as being composed of multiple cores; that's the level of abstraction. Thus, the OS calculates "portion of the last second where at least one instruction was sent to this core" for each core and then reports it. The single-number version is an average of the per-core values.

On the other hand, the OS cannot calculate stuff inside each core; the CPU hides that as part of its abstraction. That is, you cannot know "I$ utilisation", "FPU utilisation", etc.

In the GPU, the OS doesn't even see each SM (streaming multiprocessor, loosely analogous to a CPU core). It just sees the whole GPU as one black-box abstraction. Thus, it calculates utilisation as "portion of the last second where at least one kernel was executing on the whole GPU". It cannot calculate intra-GPU util at all. So one kernel executing on one SM looks the same to the OS as that kernel executing on tens of SMs!

This is the crux of the issue.

With performance counters (perf for CPU, or Nsight Compute for GPU), lots of stuff visible only inside the hardware abstraction can be calculated (SM util, warp occupancy, tensor util, etc.).

The question, then, is why the GPU doesn't schedule work onto each SM in the OS/driver, instead of doing it in a microcontroller in the hardware itself, on the other side of the interface.

Well, I think it's for efficiency reasons, and also so Nvidia has more freedom to change it without compat issues from being tied to the OS, among similar reasons. If it were done in the driver, however, the OS could calculate util for each SM and then average them, giving you more accurate values: the case with the kernel running on 1 SM would report a smaller util than the case with the kernel executing on 15 SMs.

IME, measuring with Nsight Compute causes anywhere from 5% to 30% performance overhead, so if that's OK for you, you can enable it and get more useful measurements.
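
The black-box counter is easy to demonstrate (a sketch assuming PyTorch plus the nvidia-ml-py bindings): a stream of near-empty kernels that could never fill even one SM still pegs the reported utilization.

  # Sketch: tiny kernels on a sliver of the GPU still report ~100% "util",
  # because the counter only tracks whether *any* kernel was resident.
  import pynvml
  import torch

  pynvml.nvmlInit()
  handle = pynvml.nvmlDeviceGetHandleByIndex(0)

  x = torch.zeros(32, device="cuda")   # far too small to fill even one SM
  for _ in range(200_000):
      x += 1                           # stream of near-empty kernel launches

  util = pynvml.nvmlDeviceGetUtilizationRates(handle)
  print(f"gpu util: {util.gpu}%")      # typically ~100% despite tiny occupancy
  pynvml.nvmlShutdown()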

John23832•4mo ago
The "why not use" section should probably include nvtop?
sirukinx•4mo ago
Fair, but I believe that this is intended for a web browser rather than a terminal.
w-m•4mo ago
Possibly also nvitop, which is a different tool from nvtop: https://github.com/XuehaiPan/nvitop
github-trending•4mo ago
nvitop is actually a super cool project.
phyalow•4mo ago
Absolutely.
andrewg1bbs•4mo ago
This is really cool, but I tend to prefer NVtop for now.
Havoc•4mo ago
Oh that’s neat. Been looking for a way to see VRAM temps on Linux.
huevosabio•4mo ago
In app.py it seems like you call nvidia-smi as a subprocess and then scrape that. Are there no bindings to do that directly?
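
(For reference: NVML ships official Python bindings in the nvidia-ml-py package, imported as pynvml, which expose the same fields nvidia-smi prints, so no subprocess scraping is needed. A minimal sketch:

  # Sketch: query GPU metrics over NVML instead of scraping nvidia-smi output.
  import pynvml

  pynvml.nvmlInit()
  for i in range(pynvml.nvmlDeviceGetCount()):
      handle = pynvml.nvmlDeviceGetHandleByIndex(i)
      name = pynvml.nvmlDeviceGetName(handle)
      util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # .gpu / .memory, in %
      mem = pynvml.nvmlDeviceGetMemoryInfo(handle)         # .used / .total, bytes
      temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
      print(f"GPU {i} {name}: {util.gpu}% util, "
            f"{mem.used / 2**30:.1f}/{mem.total / 2**30:.1f} GiB, {temp}°C")
  pynvml.nvmlShutdown()

)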
peterdsharpe•4mo ago
What is the benefit of this over `watch nvidia-smi`, possibly prepended with an `ssh` in the case of a remote server?
github-trending•4mo ago
Nothing super special, to be honest. It's just a quick way for me to take a look at a couple of GPU boxes from the browser. Sometimes I check it from the iPad too.
xtreme•4mo ago
"nvidia-smi -l <#num seconds>" works even better.
onefortree•4mo ago
This is awesome! Tested it out while running some plex encoding and everything worked as expected!

I did notice that nvidia-smi shows the process name as plex-transcoding but gpu-hot shows [Not Found]. Not sure if that is where the process name is supposed to go.

github-trending•4mo ago
Thanks a lot!! Yes, I have to check the names.
observationist•4mo ago
The AI/vibe-coded "purple" color scheme is a meme at this point; you might want to tweak the look and feel to not be so on the nose, but it's otherwise a good dashboard.
ionwake•4mo ago
nah I like it
moomoo11•4mo ago
Not gonna check the code (I have other things to do), but IIRC Tailwind has a purplish color by default, and it's pretty common because of that.

I think AI vibe-codes that because it's probably seen that default so much.

villgax•4mo ago
sudo apt install nvtop

// solves everything that the above container claims to do lol

github-trending•4mo ago
True, nvtop is super useful, but sometimes I want to be able to take a quick look from the browser.
guluarte•4mo ago
Another option is to use Prometheus + Grafana: https://docs.nvidia.com/datacenter/cloud-native/gpu-telemetr...
github-trending•4mo ago
That's a solid solution, but you have to configure Prometheus/Grafana, etc. But yes, Grafana rocks.

Also check out Netdata, an amazing project.

jedbrooke•4mo ago
I’m skeptical of “no ssh” being a benefit. I’d rather have one port open to the battle-tested ssh process (which I probably have already anyway) than open a port to some random application.

I suppose it’s trivial to proxy an HTTP port over ssh though, so that would seem like a good solution.

github-trending•4mo ago
I mean, I don't have to SSH into my local GPU server every time I want to take a quick look at the GPUs.
jedbrooke•4mo ago
That’s true; this would be pretty convenient for local environments.
nisten•4mo ago
Half-readable color scheme... random Python and JavaScript mixed in, ships with 2 Python CVEs out of the box across 5 total dependencies... yep, it checks out, bois... certified infested slop.

  python-socketio==5.8.0: 1 CVE (CVE-2025-61765); Remote Code Execution via malicious pickle deserialization in multi-server setups.
  eventlet==0.33.3: 1 CVE (CVE-2025-58068); HTTP request smuggling from improper trailer handling.

And then economists wonder why none of these people are getting jobs...
pixl97•4mo ago
I mean, the python-socketio CVE is from a few days ago and likely doesn't affect this package (it's not using message queues, right?).

Eventlet 0.33 is ancient; no idea why they would use that.

That said, most people should have some kind of SCA in place to ensure they're not using ancient packages. Conversely, picking up a package the day it's released has bitten a lot of people when the repository in question gets pwned.

alfalfasprout•4mo ago
TBH this seems useful only for a very select niche.

If you're a company and you have several GPU machines in a cluster, then this is kinda useless because you'd have to go onto each container or node to view the dashboard.

Sure, there's a cost to setting up OpenTelemetry plus whatever storage+viz backend, but once it's set up you can actually do alerting, historical views, analysis, etc. easily.

Cieric•4mo ago
This looks neat and would probably be cool to run on one of my passive info screens, but until it supports more than just Nvidia I'll have to stick with nvtop. Might be a good idea to pull the theme out to a file so it's all swappable too (assuming you haven't already; I can't look at the code right now).
iJohnDoe•4mo ago
Some negativity in the comments here.

I think it’s super cool. Clean design. Great for your local self-hosted system or one of your local company systems in the office.

If you have a fleet of GPUs then maybe use your SSH CLI. This is fun and cool looking though.

Avlin67•3mo ago
Why not push metrics to Grafana?