I built hud after getting burned by blocking in async Rust services during one too many 2am debugging sessions. Blocking on Tokio worker threads can quietly tank throughput and explode p99 latency, all without panics or anything obvious in the logs.
Most profiling tools technically work, but they expect you to reason clearly, correlate timelines, and keep a full mental model in your head. That’s fine at 2pm. At 2am, it’s hopeless. hud’s goal is to reduce cognitive load as much as possible: show something visual you can understand almost immediately.
The UI is modeled after a trans-Pacific night cockpit: dark, dense, and built to be readable when you’re exhausted.
Under the hood, hud uses eBPF to track scheduling latency—how long worker threads are runnable but not running. This correlates well with blocking (though it’s not a direct measurement). You can attach to a live process with no code changes and get a live TUI that highlights latency hotspots grouped by stack trace.
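If you want to see the effect without any tooling, here's a toy repro (nothing to do with hud's internals, just the symptom): one synchronous call on the only worker thread delays a sibling task that is ready to run.

    use std::time::{Duration, Instant};

    // Toy repro of the symptom, not of hud's measurement: a blocking call
    // on the only worker thread stalls a sibling task.
    #[tokio::main(flavor = "multi_thread", worker_threads = 1)]
    async fn main() {
        // Task A: wants to wake every 10 ms; records its worst overshoot.
        let ticker = tokio::spawn(async {
            let mut worst = Duration::ZERO;
            for _ in 0..50 {
                let start = Instant::now();
                tokio::time::sleep(Duration::from_millis(10)).await;
                worst = worst.max(start.elapsed().saturating_sub(Duration::from_millis(10)));
            }
            println!("worst wake-up delay: {worst:?}");
        });

        // Task B: a synchronous sleep standing in for std::fs, bcrypt, zstd...
        // While it blocks, the worker can't poll Task A, so its wake-ups land
        // late. No panic, nothing in the logs, just latency.
        let _blocker = tokio::spawn(async {
            std::thread::sleep(Duration::from_millis(200));
        });

        ticker.await.unwrap();
    }

The printed worst-case delay should come out close to the length of the blocking call, which is the kind of silent stall hud is trying to surface and pin to a stack trace.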
The usual suspects so far: std::fs, CPU-heavy crypto (bcrypt, argon2), compression (flate2, zstd), DNS via ToSocketAddrs, and mutexes held during expensive work.
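For anyone who wants the shape of it, the most common offender and the usual fixes look like this (handler names and the config path are made up, but std::fs, tokio::fs, and spawn_blocking are the real APIs):

    use tokio::task;

    // Anti-pattern: std::fs does synchronous I/O, so this ties up a worker
    // thread for the whole read.
    async fn read_config_blocking() -> std::io::Result<String> {
        std::fs::read_to_string("/etc/myapp/config.toml")
    }

    // Fix 1: use the async filesystem API instead.
    async fn read_config_async() -> std::io::Result<String> {
        tokio::fs::read_to_string("/etc/myapp/config.toml").await
    }

    // Fix 2: push unavoidable blocking work (bcrypt, argon2, zstd, ...)
    // onto Tokio's dedicated blocking pool so worker threads stay free.
    async fn hash_password(password: String) -> String {
        task::spawn_blocking(move || expensive_hash(&password))
            .await
            .expect("hashing task panicked")
    }

    fn expensive_hash(password: &str) -> String {
        // Stand-in for a real bcrypt/argon2 call.
        password.chars().rev().collect()
    }

    #[tokio::main]
    async fn main() {
        // Tiny driver so the sketch runs end to end; errors are ignored
        // because the config path above is hypothetical.
        let _ = read_config_blocking().await;
        let _ = read_config_async().await;
        println!("{}", hash_password("hunter2".into()).await);
    }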
hud is Tokio-specific (worker threads are identified by name) and requires Linux 5.8+, root, and debug symbols.
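On the "identified by name" part: Tokio's multi-thread runtime names its workers "tokio-runtime-worker" by default, and you can check what your process's workers are called with a one-off like the one below (worth keeping in mind if you rename workers via Builder::thread_name, since name-based detection would then see something else).

    // Print the name Tokio gives its worker threads. Spawned tasks run on
    // workers, so the thread seen here is one of them.
    #[tokio::main]
    async fn main() {
        tokio::spawn(async {
            // Defaults to Some("tokio-runtime-worker") on the stock
            // multi-thread runtime.
            println!("worker thread: {:?}", std::thread::current().name());
        })
        .await
        .unwrap();
    }

For the debug-symbols requirement, release builds keep usable stack traces if you set debug = true under [profile.release] in Cargo.toml.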
Very open to feedback—especially around false positives or flawed assumptions. Happy to answer questions.
https://github.com/cong-or/hud