I think this is getting a bit carried away. I don't have any argument against the observation that the average of a p95 is not something that makes sense mathematically, but if you actually understand what it is, it is absolutely still meaningful. With time series data there is always some time denominator, so it really means (say) "the p95 per minute, averaged over the last hour", which can be meaningful and useful at a glance.
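To make the distinction concrete, here is a minimal sketch (plain Python/NumPy, my own illustration rather than anything from the talk) comparing the per-minute p95 averaged over an hour with the true p95 over that hour. The two generally differ, which is the mathematical objection, but the averaged number is still a perfectly readable summary of how the per-minute p95 has been trending:

    import numpy as np

    rng = np.random.default_rng(0)

    # Fake one hour of latencies (ms): 60 one-minute buckets of 1000 requests,
    # with a couple of slow minutes thrown in.
    minutes = [rng.lognormal(mean=3.0, sigma=0.5, size=1000) for _ in range(60)]
    for bad in (10, 42):
        minutes[bad] = minutes[bad] * 5

    per_minute_p95 = [np.percentile(m, 95) for m in minutes]

    avg_of_p95s = np.mean(per_minute_p95)                   # what the dashboard shows
    true_p95 = np.percentile(np.concatenate(minutes), 95)   # the actual hourly p95

    print(f"average of per-minute p95s: {avg_of_p95s:.1f} ms")
    print(f"p95 over the whole hour:    {true_p95:.1f} ms")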
Also, the claim that "[o]nly looking at the 95th percentile is what you do when you want to hide all the bad stuff" is very context dependent. As long as you understand what it actually means, I don't see the harm in it. The author's point is that, because loading a single webpage results in 40 or so requests, you are much more likely to hit a p99 latency than the percentile alone suggests, so you should really care about p99 and up (the arithmetic is sketched below). More power to you - if that's contextually appropriate, then that is absolutely right - but it really only applies to a webserver serving webpage assets, which is only one kind of software you might be writing. I think it is definitely important to know, for one given "eyeball" waiting on your service to respond, what the actual flow is - whether it's just one request, or multiple concurrent requests, or some kind of dependency graph of calls to your service all needed in sequence - but I don't really think that challenges the commonsense notion of latency, does it?
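For what it's worth, the arithmetic behind the "40 requests" point is easy to check (my own back-of-the-envelope sketch, assuming independent requests):

    # Probability that a page load of n independent requests contains at least
    # one request slower than the per-request p99 / p95.
    n = 40
    p_hit_p99 = 1 - 0.99 ** n   # ~0.33
    p_hit_p95 = 1 - 0.95 ** n   # ~0.87

    print(f"P(page load sees a >p99 request): {p_hit_p99:.0%}")
    print(f"P(page load sees a >p95 request): {p_hit_p95:.0%}")

So with 40 requests per page, roughly a third of page loads see at least one request worse than the per-request p99, which is the basis for the "care about p99 and up" advice.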
Source: I’ve done this a lot
Good load testing tools have modes that send requests at a fixed rate regardless of how the other requests are going, precisely to handle coordinated omission. k6, for instance, calls these modes "open" and "closed": https://grafana.com/docs/k6/latest/using-k6/scenarios/concep.... They do mention the term "coordinated omission" on that page, though I feel they could have given Gil a nod for coining the term.
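To illustrate the difference (a hand-rolled conceptual sketch in Python with asyncio, not k6's API or configuration): in a closed model the next request only goes out after the previous response arrives, so a slow server quietly throttles the offered load and hides its own latency; in an open model requests are fired on a fixed schedule no matter what, so queueing delay shows up in the numbers.

    import asyncio
    import time

    async def fake_request():
        # Stand-in for an HTTP call; pretend the server occasionally stalls.
        await asyncio.sleep(0.5 if int(time.monotonic()) % 10 == 0 else 0.01)

    async def closed_model(duration_s=5.0):
        # Closed model: one request in flight at a time; a slow response delays
        # the next request, so the generated load coordinates with the server.
        latencies = []
        end = time.monotonic() + duration_s
        while time.monotonic() < end:
            start = time.monotonic()
            await fake_request()
            latencies.append(time.monotonic() - start)
        return latencies

    async def open_model(rate_per_s=100, duration_s=5.0):
        # Open model: requests start on a fixed schedule regardless of whether
        # earlier ones have finished, so queueing delay is included in latency.
        async def timed():
            start = time.monotonic()
            await fake_request()
            return time.monotonic() - start

        tasks = []
        interval = 1.0 / rate_per_s
        next_fire = time.monotonic()
        end = next_fire + duration_s
        while next_fire < end:
            await asyncio.sleep(max(0.0, next_fire - time.monotonic()))
            tasks.append(asyncio.create_task(timed()))
            next_fire += interval
        return await asyncio.gather(*tasks)

    def p95(xs):
        return sorted(xs)[int(0.95 * len(xs))]

    async def main():
        closed = await closed_model()
        opened = await open_model()
        print(f"closed-model p95: {p95(closed):.3f}s, open-model p95: {p95(opened):.3f}s")

    asyncio.run(main())

(k6's actual executors are documented at the link above; the point here is just the scheduling difference, not the tool.)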
A summary of how not to measure latency - https://news.ycombinator.com/item?id=10732469 - Dec 2015 (3 comments)