So, between 0.5 and 1 you miss out on every second representable number, between 0.25 and 0.5 you miss out on 3/4 of them, and so on.
I guess for many cases that's good enough, but the article seems like a nice improvement.
ETA: Lemire has some thoughts on this [1] and links to what might be a prior solution [2]. Vigna (of xoroshiro fame) writes about it at the bottom of [3] and also links to [2]. So, presumably the implementation described in the article is faster? ("There have been some past attempts to fix these flaws, but none that avoid a huge performance penalty while doing so.")
EDIT2: BTW, one of the things I love about HN (well, the world, really) is that there are people that care deeply that we can uniformly sample floats between 0 and 1 correctly, and all of them, and do it faster.
[0] see https://github.com/JuliaLang/julia/blob/master/stdlib/Random...
    rand(r::AbstractRNG, ::SamplerTrivial{CloseOpen01_64}) = rand(r, CloseOpen12()) - 1.0
[1] https://lemire.me/blog/2017/02/28/how-many-floating-point-nu...
[2] https://mumble.net/~campbell/2014/04/28/uniform-random-float
There are more than 2^53 floating point values between 0 and 1, but if you use them all, you'll fail to have a uniform distribution. Either certain regions of the space will be far more likely than other regions, or certain values will be far more likely than other values. The article takes the view that the second of those options is desirable, but it's not clear why.
And the algorithm the article describes is:
1. Generate a random float the uniform way.
2. If it's close to zero and has extra precision available, generate some more random bits to fill in the extra precision.
It's always going to be faster to skip step 2. What are we gaining by not skipping it?
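A minimal sketch of that two-step idea (not the article's code; rand64 is a stand-in for any 64-bit generator, and __builtin_clzll is a GCC/Clang builtin):

#include <stdint.h>
#include <math.h>

// Hypothetical 64-bit random source; any decent PRNG works here.
extern uint64_t rand64(void);

double uniform_refined(void) {
    uint64_t k = rand64() >> 11;                  // step 1: the usual 53 random bits
    if (k == 0)
        return 0.0;                               // the 2^-53 tail; extend further if you care
    int extra = 52 - (63 - __builtin_clzll(k));   // mantissa bits left unused below the leading 1
    if (extra > 0) {                              // step 2: only needed for values below 0.5
        uint64_t fill = rand64() >> (64 - extra); // that many fresh random bits
        return ldexp((double)((k << extra) | fill), -53 - extra);
    }
    return ldexp((double)k, -53);
}

Within each binade this reaches every representable double with equal probability, instead of only the multiples of 2^-53.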
2^53 equally-spaced values within [0, 1) is plenty enough for many use cases, but it's still fundamentally a set of fixed-point rather than floating-point numbers. For a true random floating-point value, all the available precision should be used such that every single value in the domain has some probability mass (except maybe subnormals, but you can usually forget about subnormals). Especially with 32-bit floats it's not that difficult to run out of precision when you start doing math and only have a fixed-point subset of size 2^23 available.
Neither of these things is a spaceship, but it will be obvious to almost anybody why the Punto isn't a spaceship, whereas you are less likely to know enough about the F-14 to see why it's not "good enough".
They are integers. Each float is a pattern of 64 discrete bits. Discreteness means integers.
x = 0.5 (with bit pattern 0x3f000000)
y = 0.25 (with bit pattern 0x3e800000)
then x+y = 0.75 (with bit pattern 0x3f400000)
But if we just added the bit patterns like integers, we would get 0x7d800000 (with a float value of over 10^37). Just because they are discrete doesn't mean they are integers, only that they can be mapped one-to-one with integers. The bit pattern is not the semantic meaning of the value, and you cannot perform correct operations if you ignore the semantics.
Only if you generate them all with equal probability.
The way I would define a uniform distribution is the following:
Take any two floating-point numbers, r1 and r2, which form the range [r1,r2] over the real numbers, and any second pair of floating-point numbers, s1 and s2, which form a range [s1,s2] over the real numbers that is contained in [r1,r2]. The probability of getting a result in [s1,s2] when sampling from [r1,r2] must equal (s2-s1)/(r2-r1), computed with infinite precision.
This is obviously possible to achieve.
Think of it as binary subdivision/search of [0,1], using a stream of random bits. Steps:
(1) Divide the current interval in 2 (using its midpoint); (2) Using one random bit, pick one of the 2 intervals; (3) if the picked interval (new current interval) lies entirely on the domain of a single floating point value[def1], stop and return this value, else go to (1).
[def1] The domain associated to a floating point value is the interval between the midpoint between it and its lower neighbor on the left, and the midpoint between it and its higher neighbor on the right.
I expect the performance to be very poor, but it does cover all floating point numbers in [0,1] with exactly the correct probabilities (assuming the bit probabilities are exactly correct). That's in part because, done naively, you need higher-precision arithmetic, as well as up to 53 or so iterations (on average well over 32, I presume).
(I've left out the proof that probabilities are correct, but I believe it's easy to show)
Having any number be possible is nice: having the entire space reachable when you're sampling a solution space and the solution may be smaller than 1/8, or having small vector quantities not have their angles ridiculously constrained by quantization.
Most applications don't need it-- indeed, a double has a lot of precision compared to what most people care about, and throwing away some of it at the low end of the randomness range is no big deal. But sometimes, a double is barely (or not even) enough.
You're out of luck there; no matter what happens you can only generate floating point numbers. If you're worried that the solution lies within the space of floating point numbers, but not within the part that's reachable by generating uniform samples with 1/2^53 resolution, you need to be much, much, much more worried that the solution isn't a floating point number at all.
By contrast, if you're unconcerned with the numeric properties of the numbers you're generating, and what you want is to sample "floating point values" to examine what kind of implementation-specific behavior they might have, you would never want this scheme - you'd want to generate random floating-point values, which you can do by just filling in random bits. Those will be concentrated around zero, which doesn't matter because you're not interpreting them as numbers.
2^53 is 15 sigfigs. My answer might be around 10 pico-something and I might need a few digits of precision. Then I've thrown away 11 of my sigfigs and there may not be a numerically stable solution with what's left.
As you point out, double precision often isn't enough. But it's much more likely to not be enough if we throw large parts of the precision away.
That's fine: for the normal understanding of uniformity on the interval, you want the probability that a value is chosen to be proportional to the "space" between its left and right neighbors, and that space is not constant across the interval.
What you try to emulate is picking a real number uniformly between 0 and 1, then returning the closest floating point number (except 1, maybe).
I'm a contributor to the PHP programming language and a maintainer of its randomness functionality. I did some research into this topic in order to provide a floating-point generation method that is as good as possible, and came across the "Drawing Random Floating-point Numbers from an Interval" paper by Frédéric Goualard.
PHP 8.3 ships with a method generating random floats from arbitrary intervals based on the γ-section algorithm described in that paper. The PHP documentation goes into additional detail:
https://www.php.net/manual/en/random-randomizer.getfloat.php
As part of the implementation I also reached out to Prof. Goualard with some questions for clarification regarding the behavior for extreme values, which resulted in a corrigendum being published, since the algorithm was buggy in those cases (https://frederic.goualard.net/publications/corrigendum.pdf). The entire exchange with Prof. Goualard was really pleasant.
* Well, almost. I messed up the probability around subnormals slightly and haven't gotten around to fixing it yet.
Still, the algorithm itself should work for all other values, and it was measurably more accurate than the regular algorithm.
In terms of performance it is 10x slower compared to the regular implementation of `(randu32(x)>>8)*(1.0f/(1<<24))` on my Zen1 desktop.
i.e. what percentage of the range do they occupy and how often would one expect to need to sample to obtain one?
One can reason to this conclusion by noting that the subnormals have the same effective exponent as the smallest normal number (except that they use the 0.xxx representation instead of 1.xxx; the exponent in question is -126 for floats and -1022 for doubles), and then observing that all normal numbers with exponents from that minimum up to and including -1 are also in the range. Since the number 1 is usually not included in the interval, we can ignore it, but even if it was, it would be the only representative from its exponent (0) anyway, so the effect on the size of the range would be minuscule.
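To put numbers on it: the subnormal doubles all lie below 2^-1022, so they cover a fraction of only about 2^-1022 of [0, 1), and with correctly-weighted uniform sampling you'd expect on the order of 2^1022 draws before ever seeing one.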
Even calculating your exponent with a single count_trailing_zeros on a single 64 bit value will give you results indistinguishable from perfect.
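Something like this, for instance (a sketch only; rand64 is a stand-in 64-bit source and __builtin_ctzll is a GCC/Clang builtin):

#include <stdint.h>
#include <math.h>

// Hypothetical 64-bit random source.
extern uint64_t rand64(void);

double uniform_ctz(void) {
    uint64_t bits = rand64();
    int e = bits ? __builtin_ctzll(bits) : 64;    // P(e = n) = 2^-(n+1), capped at 64
    uint64_t mant = rand64() >> 12;               // 52 fresh mantissa bits
    // result lies in [2^-(e+1), 2^-e); anything needing e > 64 has probability below 2^-64
    return ldexp((double)(mant | (UINT64_C(1) << 52)), -53 - e);
}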
https://www.jstatsoft.org/index.php/jss/article/view/v007i03...
(Dieharder is probably a better suite to use at this point, but that paper is approachable.)
https://pharr.org/matt/blog/2022/03/05/sampling-fp-unit-inte...
https://marc-b-reynolds.github.io/distribution/2017/01/17/De...
Immediately there are alarms going off in my mind. Your machine definitely can't pick real numbers; Almost All of them are non-computable. So you definitely can't be doing that, which means you should write down what you've decided to actually do, because it's not that.
What you actually want isn't the reals at all, you want a binary fraction (since all your f64s are in fact binary fractions) between 0 and 1, and so you just need to take random bits until you find a one bit (the top bit of your fraction), counting all the zeroes to decide the exponent, then you need 52 more bits for your mantissa.
I'm sure there's a faster way to get the same results, but unlike the article's imagined "drawing a real number" this is actually something we can realize, since it doesn't involve non-computable numbers.
Edited: You need 52 more bits, earlier this comment said 51 but the one bit you've already seen is implied in the floating point type, so we need 52 more, any or all of which might be zero.
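A rough sketch of that bit-at-a-time construction, assuming a random_bit() source (illustration only, not tuned for speed):

#include <stdint.h>
#include <math.h>

// Hypothetical source of unbiased random bits.
extern int random_bit(void);

double random_binary_fraction(void) {
    int zeros = 0;
    while (!random_bit()) {                     // count zeros until the first 1 appears
        if (++zeros > 1074)                     // we've walked past the smallest subnormal
            return 0.0;
    }
    uint64_t mant = 0;
    for (int i = 0; i < 52; i++)                // 52 explicit mantissa bits, any or all may be zero
        mant = (mant << 1) | (uint64_t)(random_bit() & 1);
    // value = 1.mantissa * 2^-(zeros+1); ldexp rounds if we land among the subnormals
    return ldexp((double)(mant | (UINT64_C(1) << 52)), -(zeros + 1) - 52);
}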
This unimplementable algorithm for an imaginary machine is still useful, because it nails down the probability distribution we expect the output samples to have. And because they're sampled from a finite value space and the probabilities of obtaining each possible output are rational numbers, we have a chance of finding an implementable algorithm for a real machine that produces the same output distribution.
The rest of the article goes into detail on what they decided to actually do. It's similar to your proposal but has more edge-case handling and deals with different rounding modes.
Sure, but there is a simple correspondence between modern/typical computers (that does require a potentially unlimited supply of new hard drives, but that is rarely relevant) and Turing machines, while such a correspondence can probably never exist for the "real number" model.
I had hoped that, somewhere in the article, the author would say, "In this article, the term 'random' is shorthand for 'pseudorandom'." But no.
Programming students might read the article and come away with the idea that a deterministic algorithm can generate random numbers.
This is like the sometimes-heard claim that "Our new error-detecting algorithm discovers whether a submitted program contains errors that might make it stop." Same problem -- wrong as written, but no qualifiers.
It will be just as happy with dice rolling, the PRNG from Commodore 64 BASIC, the built-in random numbers from a modern Intel CPU, or "random" bits harvested by sampling keyboard input; it just takes the "random" bits and makes floating point values.
So there's no reason to say it must be pseudo-random. In most practical applications it would be, but that doesn't seem essential AFAICT
(Sorry, non-native English speaker; originally in German:) "Hold on, are you saying that, in all likelihood, there are no random numbers a computer could generate at all, since 'bits' can only ever take either the value '0' or '1' (and these days on several layers at that, but that's another story...)?"
let me try to explain it a little...
3 * (A = >0) adding... 1 * (A = 1) (..another topic yes...)
...but that what was wanted is "NONE" (of it...) or in other words 'maybe random'??
??
:confused:
So, if you want a uniform variate, but you want every number in the range to be possible to generate, it's tricky. Each individual small number needs to be much less likely than each individual large number.
If you just draw from the 2^62 space randomly, you almost certainly get a very small number.
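Concretely: the roughly 2^62 doubles in (0, 1) are spread over about 1022 binades of 2^52 values each, so a draw that treats all of them as equally likely lands below 2^-511 about half the time.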
double random_real(double low, double high, int (*random_bit)(void)) {
    if (high < low)
        return random_real(high, low, random_bit);
    double halfway, previous = low;
    while (1) {                              /* was "while (true)"; plain C needs <stdbool.h> for that */
        halfway = low + (high - low) / 2;    /* bisect the current interval */
        if (halfway == previous)
            break;                           /* the interval can no longer shrink: we've converged */
        if (random_bit() & 1)
            low = halfway;                   /* keep the upper half */
        else
            high = halfway;                  /* keep the lower half */
        previous = halfway;
    }
    return halfway;
}
That should theoretically produce a uniformly-distributed value. (Although perhaps I've missed some finer point?)
Secondly, why does this generate a uniform random number? It’s not clear to me at all. It seems it would suffer the exact problem GP’s talking about here, that certain ranges of numbers would have a much higher probability than others on a weighted basis.
Each interval of equal size occurs with equal likelihood at each step.
Consider that you want to generate a random number between 0 and 1024 (excl.). The midpoint would be 512, thus you choose randomly whether the lower interval [0, 512) or the higher interval [512, 1024) is selected. In each step, the size of the remaining range is independent of the concrete numbers, i.e. after k steps it is exactly 2^(-k) * (high - low), and in each step both halves have equal probability. Thus, if the algorithm terminates, the selected number was in fact uniformly sampled.
I would also presume it must terminate. Assume that the two endpoints are one ulp apart. The computed midpoint is then one of the two endpoints, and no randomness is involved (barring FPU flags, but they don't count). In the next step, the algorithm either terminates or sets the endpoints equal, which also fixes the midpoint. Thus the procedure always returns the desired result.
After some testing I discovered a problem but not with the comparison. The problem is with calculating the halfway value. There are some doubles where their difference cannot be represented as a double:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <float.h>

double random_real(double low, double high, int (*random_bit)(void)) {
    if (high < low)
        return random_real(high, low, random_bit);
    double halfway, previous = low;
    while (1) {
        /* with low = -DBL_MAX and high = DBL_MAX, (high - low) overflows to +inf */
        halfway = low + (high - low) / 2;
        if (halfway == previous)
            break;
        if (random_bit() & 1)
            low = halfway;
        else
            high = halfway;
        previous = halfway;
    }
    return halfway;
}

int main(int argc, char *argv[]) {
    srand(time(NULL));
    for (int i = 0; i < 1000000; i++) {
        double r = random_real(-DBL_MAX, DBL_MAX, rand);
        printf("%f\n", r);
    }
}
double random_real(double low, double high, int (*random_bit)(void)) {
    if (high < low)
        return random_real(high, low, random_bit);
    /* clamp the interval to [-DBL_MAX/2, DBL_MAX/2] so that (high - low) cannot overflow */
    const double max = DBL_MAX / 2;
    if (high > max)
        high = max;
    const double min = -max;
    if (low < min)
        low = min;
    double halfway, previous = low;
    while (1) {
        halfway = low + (high - low) / 2;
        if (halfway == previous)
            break;
        if (random_bit() & 1)
            low = halfway;
        else
            high = halfway;
        previous = halfway;
    }
    return halfway;
}
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <float.h>

double random_real(double low, double high, int (*random_bit)(void)) {
    if (high < low)
        return random_real(high, low, random_bit);
    double halfway, previous = low;
    while (1) {
        /* compute the midpoint as low + (high/2 - low/2) so the subtraction cannot overflow */
        halfway = low + (high/2 - low/2);
        if (halfway == previous)
            break;
        if (random_bit() & 1)
            low = halfway;
        else
            high = halfway;
        previous = halfway;
    }
    return halfway;
}

int main(int argc, char *argv[]) {
    srand(time(NULL));
    for (int i = 0; i < 1000000; i++) {
        double r = random_real(-DBL_MAX, DBL_MAX, rand);
        printf("%f\n", r);
    }
}
I don't think this is true; I'm guessing that the author borrowed this observation from one of the various other articles linked in this thread, that address the situation where we draw a 64-bit random unsigned integer and divide by `1<<64`, instead of drawing 53 bits and dividing by `1<<53`. The former does introduce a bias, but the latter doesn't (but does still limit the fraction of representable values).
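For concreteness, the two constructions being contrasted look roughly like this (hexadecimal float constants are C99; the function names are just for illustration):

#include <stdint.h>

// The 53-bit construction: every such product is exact, so no rounding and no bias.
double via_53_bits(uint64_t u64) { return (u64 >> 11) * 0x1.0p-53; }

// The 64-bit construction: the uint64-to-double conversion must round before the
// multiply, and that rounding is where the slight bias comes from.
double via_64_bits(uint64_t u64) { return (double)u64 * 0x1.0p-64; }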
This is e.g. the case for classical LCGs and the xoroshiro*+ family of PRNGs:
> If, however, one has to generate only 64-bit floating-point numbers (by extracting the upper 53 bits) xoshiro256+ is a slightly (≈15%) faster generator with analogous statistical properties. For general usage, one has to consider that its lowest bits have low linear complexity and will fail linearity tests (https://prng.di.unimi.it/)
> When m = 2k there is a further problem. The period of the bth bit of an LCG (where bits are numbered from the right, starting at one) is 2b , thus although the period is 2k, only the high bits are good and the lower order bits exhibit a clear repeating pattern. (https://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf)
I use the PRNGs here for my own needs and they work great... at least as far as I can tell. :-)
Between 0 and 1, you have about a quarter of all floating point numbers (and then a quarter > 1, a quarter < -1, and a quarter between -1 and 0).
Between 1 and 2, you have about 0.024% of all floating point numbers (for double precision, a factor of around 1023).
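Working that out for doubles: [1, 2) holds 2^52 values out of roughly 2^64 bit patterns, i.e. about 2^-12 ≈ 0.024%, while (0, 1) holds about 1022 × 2^52 ≈ 2^62 of them, which is where the factor of around a thousand comes from.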
I do wish the solution were a bit simpler though. Could this be upstreamed into the language's API? https://cs.opensource.google/go/go/+/refs/tags/go1.24.3:src/...
If a language is in use, and if people have written code that generates pseudorandom sequences based on an initial seed, then no -- bad idea. Some programs rely on the same pseudorandom sequence for a given seed, and may fail otherwise.