A given event happens at random, once every 10 minutes on average (say, a Poisson process). We can see that:
- The expected length of the interval between events is 10 minutes.
- At a random moment in time the expected wait until the next event is 10 minutes.
- At the same moment, the expected time passed since the last event is also 10 minutes.
But then we would expect the interval between two consecutive events to be 10 + 10 = 20 minutes long, even though we know intervals are 10 minutes on average. What happened here?
The key is that by picking a random moment in time, you're more likely to fall into a bigger interval. Sampling a random point in time, the average interval you fall into really is 20 minutes long; sampling a random interval, it is 10.
Apparently this is called the Waiting Time Paradox.
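A quick simulation makes this concrete. Here's a minimal sketch (assuming exponential inter-arrival times, i.e. a Poisson process; the sample sizes are arbitrary):

```python
import bisect
import random

random.seed(0)
MEAN = 10.0   # average minutes between events
N = 200_000   # number of simulated intervals

# Exponential gaps between events: a Poisson process with mean interval MEAN.
gaps = [random.expovariate(1 / MEAN) for _ in range(N)]
print("average interval:", sum(gaps) / N)   # ~10

# Cumulative event times, so we can locate the interval around a random moment.
events = []
t = 0.0
for g in gaps:
    t += g
    events.append(t)

# Pick random moments in time; record the length of the interval each lands in.
lengths = []
for _ in range(50_000):
    moment = random.uniform(0, events[-1])
    i = bisect.bisect_left(events, moment)
    left = events[i - 1] if i > 0 else 0.0
    lengths.append(events[i] - left)
print("interval around a random moment:", sum(lengths) / len(lengths))   # ~20
```

Averaging over intervals gives ~10; averaging over random moments gives ~20, because each interval is hit with probability proportional to its length.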
You went astray when you declared the expected wait and the expected time passed.
Draw a number line. Mark it at intervals of exactly 10. Uniformly randomly select a point on that line. The expected wait and the expected time passed (i.e. the distances forward and backward to the nearest marks) are both 5, not 10. The range is 0 to 10.
When you randomize the event occurrences but keep the average interval the same, you change the range maximum and the overall distribution across the range, but not the expected average values.
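The evenly spaced case is easy to check numerically; a minimal sketch (the spacing and trial count are arbitrary choices):

```python
import random

random.seed(1)
SPACING = 10.0
trials = 100_000
wait = passed = 0.0
for _ in range(trials):
    # A uniform random point between two marks exactly SPACING apart.
    p = random.uniform(0, SPACING)
    passed += p            # time since the previous mark
    wait += SPACING - p    # time until the next mark
print(wait / trials, passed / trials)   # both ~5, each ranging over (0, 10)
```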
To see this, consider just two intervals of length x and 2-x (with 0 ≤ x ≤ 2), i.e. 1 on average. A random point falls in the first interval x/2 of the time and in the second one the other 1-x/2 of the time, so the expected length of the interval containing a random point is x/2 * x + (1-x/2) * (2-x) = x² - 2x + 2, which is 1 for x = 1 but larger everywhere else, reaching 2 for x = 0 or 2.
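A quick numeric check of that formula (the sample values of x are arbitrary):

```python
def expected_containing_interval(x):
    """Expected length of the interval holding a uniform random point on [0, 2]
    when the line is split into pieces of length x and 2 - x (0 <= x <= 2)."""
    return (x / 2) * x + (1 - x / 2) * (2 - x)   # simplifies to x**2 - 2*x + 2

for x in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x, expected_containing_interval(x))   # 1 at x = 1, rising to 2 at the ends
```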
But this process has no "memory": no matter how much time has passed since the last event, the expected number of events during the next unit of time is still lambda.
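That memorylessness is easy to verify by simulation; a minimal sketch (the rate, thresholds, and sample size are arbitrary):

```python
import random

random.seed(2)
LAM = 0.1   # rate: one event per 10 minutes
draws = [random.expovariate(LAM) for _ in range(200_000)]

t, s = 10.0, 25.0
# P(T > t) for a fresh wait...
fresh = sum(d > t for d in draws) / len(draws)
# ...versus a wait that has already survived s minutes (condition on T > s).
survivors = [d - s for d in draws if d > s]
conditioned = sum(d > t for d in survivors) / len(survivors)
print(fresh, conditioned)   # both ~exp(-1) ≈ 0.368: the elapsed time is irrelevant
```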
But permanently redirecting so I can't see this after I enable JavaScript is just uncool, and might not endear one on a site like HN where lots of folks disable JS initially.
Edit: and anonymizing, disabling, and reloading... It's just text with formatted math. Sooo many other solutions to this, jeesh guys.
https://garcialab.berkeley.edu/courses/papers/Clarke1946.pdf
https://www.acsu.buffalo.edu/~adamcunn/probability/probabili...
meatmanek•14h ago
1. They're often a pretty good approximation for how web requests (or whatever task your queuing system deals with) arrive into your system, as long as your traffic is predominantly driven by many users who each act independently. (If your traffic is mostly coming from a bot scraping your site that sends exactly N requests per second, or holds exactly K connections open at a time, the Poisson distribution won't hold.) Sort of like how the normal distribution shows up any time you sum up enough random variables (central limit theorem), the Poisson arrival process shows up whenever you superimpose enough uncorrelated arrival processes together (simulated in the sketch after this list): https://en.wikipedia.org/wiki/Palm%E2%80%93Khintchine_theore...
2. They make the math tractable -- you can come up with closed-form solutions for, e.g., the probability distribution of the number of users in the system, the average waiting time, the average number of users queuing, and so on: https://en.wikipedia.org/wiki/M/M/c_queue#Stationary_analysi... https://en.wikipedia.org/wiki/Erlang_(unit)#Erlang_B_formula
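A minimal sketch of both points (the source count, horizon, offered load, and server count are arbitrary; the Erlang B computation uses a standard numerically stable recurrence rather than the factorial formula directly):

```python
import random

# Point 1: superimpose many non-Poisson sources. Each source here fires exactly
# once, at a uniformly random time in the window (think: a periodic source with
# a random phase). Individually that's nothing like Poisson; the superposition is.
random.seed(3)
n_sources, horizon = 200, 100
arrivals = [random.uniform(0, horizon) for _ in range(n_sources)]
counts = [0] * horizon
for t in arrivals:
    counts[min(int(t), horizon - 1)] += 1          # arrivals per 1-unit window
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(f"mean={mean:.2f} variance={var:.2f}")       # both ~2, as Poisson predicts

# Point 2: closed-form queueing math. Erlang B gives the blocking probability
# of an M/M/c/c system (c servers, no queue) under a given offered load.
def erlang_b(offered_load, servers):
    b = 1.0                                        # B(A, 0) = 1
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b

print(erlang_b(2.0, 5))   # ~0.037: ~3.7% of arrivals find all 5 servers busy
```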
emmelaich•11h ago
Also related to the Birthday Problem and hash bucket hits, though with those you're usually only interested in keeping collisions low. With some queues (e.g. the database example above) you might instead be interested in when collisions hit a high number.
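The underlying calculation is the same; a minimal sketch of the birthday-problem probability (the item and bucket counts are arbitrary examples):

```python
from math import prod

def p_any_collision(n_items, n_buckets):
    """Probability that at least two of n_items share a bucket, assuming
    independent uniform hashing -- the birthday problem."""
    return 1.0 - prod(1 - i / n_buckets for i in range(n_items))

print(p_any_collision(23, 365))       # ~0.507, the classic birthday threshold
print(p_any_collision(1000, 2**20))   # ~0.38 even with ~a million buckets
```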