xCapture gives you an efficient, always-on observability signal for dimensional performance analysis of thread-level activity. In passive sampling mode on a 104 vCPU machine (~2000 threads), xcapture used 0.07% of a single CPU, a tiny fraction of the machine's total CPU capacity.
We just had a "launch party" at P99CONF, and the 20-minute talk is available at:
https://www.p99conf.io/session/xcapture-v3-efficient-always-...