A simple way to generate random points on a sphere

https://www.johndcook.com/blog/2025/05/06/random-points-on-a-sphere/

60•piinbinary•4d ago

Comments

pavel_lishin•4d ago

> First, it’s intuitively plausible that it works.

Maybe; my first instinct is that there'll be some bias somewhere.

Maybe I'll have some time tonight to play with this in p5js.

pavel_lishin•4d ago

I had some time!

It looks reasonably random to my eye: https://editor.p5js.org/fancybone/sketches/DUFhlJvOZ

guccihat•3d ago

Cool demo. A minor nitpick is that the code (and the article) forgets to handle the special case of a point inside the cube that happens to be exactly (0,0,0). This will result in a divide by zero when the vector is normalized.

NoahZuniga•3d ago

The chance of this happening is less than 1 in 2^128. This will never happen.

pavel_lishin•3d ago

Unless you're demoing it to someone very important, in which case it'll happen twice in a row.

rkomorn•5h ago

Especially if you have already had the conversation with anyone and confidently stated that, yes, the possibility exists but it's so remote that it's just not worth addressing.

cluckindan•6h ago

With long enough timescales, every event with a non-zero probability will eventually happen.

pavel_lishin•3d ago

That nitpick is both minor, and absolutely correct!

spyrja•6h ago

Kind of? If you want the points to be more randomly-distributed, something like this would probably be a better approach: https://editor.p5js.org/spyrja/sketches/eYt7H36Ka

sparky_z•3d ago

That was my first instinct as well, but I thought through it a little more and now it seems intuitively correct to me.

-First of all, it's intuitive to me that the "candidate" points generated in the cube are randomly distributed without bias throughout the volume of the cube. That's almost by definition.

-Once you discard all of the points outside the sphere, you're left with points that are randomly distributed throughout the volume of the sphere. I think that would be true for any shape that you cut out of the cube. So this "discard" method can be used to create randomly distributed points in any 3d volume of arbitrary shape (other than maybe one of those weird pathological topologies.)

-Once the evenly distributed points are projected to the surface of the sphere, you're essentially collapsing each radial line of points down to a single point on the sphere. And since each radial line has complete rotational symmetry with every other radial line, each point on the surface of the sphere is equally likely to be chosen via this process.

That's not a rigorous proof by any means, but I've satisfied myself that it's true and would be surprised if it turned out not to be.

pavel_lishin•3d ago

To me, it seems like there would be less likelihood of points being generated near the surface of the sphere, and that should have some sort of impact.

sparky_z•3d ago

OK, look at it this way. Imagine that, after you generate the points randomly in the cube, and discard those outside the sphere, you then convert the remaining points into 3D polar coordinates (AKA spherical coordinates [0]). This doesn't change the distribution at all, just the numerical representation. So each point is described by three numbers, r, theta, and phi.

You're correctly pointing out that the values of r won't be uniformly distributed. There will be many more points where the value of r is close to 1 then there will be where the value of r is close to 0. This is a natural consequence of the fact that the points are uniformly distributed throughout the volume, but there's more volume near the surface than there is near the center. That's all true.

But now look at the final step. By projecting every point to the surface of the sphere, you've just overwritten every single point's r-coordinate with r=1. Any bias in the distribution of r has been discarded. This step is essentially saying "ignore r, all we care about are the values of theta and phi."

[0]https://en.wikipedia.org/wiki/Spherical_coordinate_system

layer8•7h ago

It should be intuitively clear that rotating the sphere (or the cube) won’t change the distribution of the random points before projection, hence the distribution of the projected points must be independent of the orientation of the sphere, and hence independent of any particular location on the sphere.

Or in other words, if you take the “dotted” sphere out of the cube afterwards, you won’t be able to tell by the dots which way it was originally oriented within the cube.

bob1029•7h ago

The part that makes this work is the rejection aspect.

What would be biased is if you inscribed a cube in the unit sphere. This would require additional transformations to create a uniform distribution. If you simply "throw away" the extra corner bits that aren't used, it won't affect the distribution.

arvindh-manian•4d ago

> The advantage of this approach is that it generalizes efficiently to any number of dimensions.

I am unsure about whether this is true. The ratio of a ball’s volume to its enclosing hypercube’s volume should decrease to 0 as dimensionality increases. Thus, the approach should actually generalize very poorly.

alberto_balsam•3d ago

Note that the author is not referring to the accept-reject method here

arvindh-manian•3d ago

Ah I misread this, thanks

scythe•7h ago

Let S = {S_i} be any set of cubes that covers a d-sphere. Choose a point in a cube and an integer i in [0, |S|). Now you have a random point in S. With a judicious choice of S you obtain a uniformly random point in the unit sphere with high probability.

egorfine•3d ago

Please forgive me my naivete, but won't generating two random polar coordinates do? I'm bad at math, so I might as well be very very wrong here, but I'd like to know.

Edit: see @srean's excellent explanation why that won't do.

danwills•3d ago

I think it can be done that way yeah but in order to yield a uniform-density of points on the surface of the sphere there's some pre-correction (maybe a sqrt or something? I can't remember) that's needed before feeding the 'uv' values to the trig functions to make 3D positions. Otherwise points will 'bunch up' and be more dense at the poles I think.

egorfine•3d ago

Generating two random 1..360 numbers and converting them to xyz would bunch up at the poles?

danwills•3d ago

Yeah @srean gives the example of the different areas of strips at different lattitude, that's a good one - and I think if you imagine wrapping the unit square (2 values randomly between 0 and 1) to a sphere in a lat-long way, the whole top and bottom edges of the square get contracted to single points at the top and bottom latitude locations (respectively) on the sphere.. so if the point density was uniform going into that then it surely won't be afterwards ;)

egorfine•3d ago

See @srean's explanation above.

srean•3d ago

Indeed.

One way to fix the problem is to sample uniformly not on the latitude x longitude rectangle but the sin (latitude) x longitude rectangle.

The reason this works is because the area of a infinitesimal lat long patch on the sphere is dlong x lat x cosine (lat). Now, if we sample on the long x sin(lat) rectangle, an infinitesimal rectangle also has area dlong x dlat x d/dlat sin(lat) = dlong x dlat cos (lat).

Unfortunately, these simple fixes do not generalize to arbitrary dimensions. For that those that exploit rotational symmetry of L2 norm works best.

srean•3d ago

If you want uniformly random on the spherical surface then uniformly at random in polar coordinates will not cut it.

To appreciate why, consider strips along two constant latitudes. One along the Equator and the other very close to the pole. The uniformly random polar coordinates method will assign roughly the same number points to both. However the equatorial strip is spread over a large area but the polar strip over a tiny area. So the points will not be uniformly distributed over the surface.

What one needs to keep track of is the ratio between the infinitesimal volume in polar coordinates dphi * dtheta to the infinitesimal of the surface area. In other words the amount of dilation or contraction. Then one has apply the reciprocal to even it out.

This tracking is done by the determinant of the Jacobian.

egorfine•3d ago

This is now crystal clear and obvious to me, thank you very much for the great explanation!

srean•3d ago

Happy to help.

dan-robertson•6h ago

Looking at Jacobians is the general method but one can rely on an interesting property: not only is the surface area of a sphere equal to the surface area of a cylinder tightly enclosing it (not counting end caps), but if you take a slice of this cylinder-with-sphere-inside, the surface area of the part of the sphere will be equal to the surface area of the shorter cylinder that results from the cutting.

This gives an algorithm for sampling from a sphere: choose randomly from a cylinder and then project onto a sphere. In polar coordinates:

  sample theta uniformly in (0,2pi)
  sample y uniformly in (-1,1)
  project phi = arcsin(y) in (-pi,pi)
  polar coordinates (theta, phi) define describe random point on sphere

Potentially this is slower than the method in the OP depending on the relative speeds of sqrt and arcsin.

spyrja•5h ago

That's a neat approach! So basically something like this: https://editor.p5js.org/spyrja/sketches/eYt7H36Ka

scotty79•4h ago

This seems more like a random point on a spiral on the sphere. There was a thing on hn about spirals on a sphere few days ago.

spyrja•4h ago

Ah, good catch. I forgot to scale the point properly! How does the updated sketch look?

EDIT: Plotting it out as a point cloud seems to confirm your suspicion.

Without scaling: https://editor.p5js.org/spyrja/sketches/7IK_RssLI

Scaling fixed: https://editor.p5js.org/spyrja/sketches/kMxQMG0dj

krackers•4h ago

I think this 2D version shows the issue clearly

https://mathworld.wolfram.com/DiskPointPicking.html

navane•5h ago

i feel somehow three rotations should be able to do it, 3 rands between 0 and 2pi.

jacobolus•3d ago

Here are a few alternatives:

https://observablehq.com/@jrus/stereorandom

At least when trying to end up with stereographically projected coordinates, in general it seems to be faster to uniformly generate a point in the disk by rejection sampling and then transform it by a radially symmetric function to lie on the sphere, rather than uniformly generating a point in the ball and then projecting outward. For one thing, fewer of the points get rejected because the disk fills more of the square than the ball fills of the cube.

mkaic•7h ago

My favorite way to generate random points on a n-dimensional sphere is to just sample n times from a Gaussian distribution to get a n-dimensional vector, and then normalizing that vector to the radius of the desired sphere.

ravin_gees•6h ago

Wonder if you get any numerical instability here in high dimensions by doing a sum of exponentials? Probably not because they’re Gaussian (no long tails) but after looking at scipy.special.logsumexp [1] I’m a bit wary of sums of exponentials with float32. Would be curious to see if there’s any characterization of this (the cited paper in the article only considers the low dimensional case)

[1] https://docs.scipy.org/doc/scipy/reference/generated/scipy.s...

_alternator_•6h ago

The only reason I can think of that you’re getting downvoted because this is mentioned in the article. This is a strictly better method than the accept/reject method for this application. The runtime of the accept reject algorithm is exponential in the dimension because the ratio between the volume of the sphere is exponentially smaller than the volume of the hypercube.

I’d also point out that the usual way Gaussians are generated (sample uniformly from the unit interval and remap via the Gaussian percentile) can be expressed as sampling from a d-dimensional cube and then post-processing as well, with the advantage that it works in one shot. (Edit: typo)

Sharlin•5h ago

Mentioned in the article. Surely you read it, didn't you?

sfpotter•6h ago

Hmm, I don't buy it. The simplicity of just normalizing some Gaussian random deviates (especially since you generate them two at a time using Box-Muller) seems better than accept-reject. Especially considering that the ratio of the volume of the n-dimensional ball to the volume of [-1, 1]^n tends to zero as n tends to infinity exponentially fast...

Sharlin•5h ago

Aymptotic behavior doesn't really matter given that this algorithm is almost exclusively used with n=3.

nwallin•5h ago

Accept-reject methods are nonstarters when the architecture makes branching excessively expensive, specifically SIMD and GPU, which is one of the domains where generating random points on a sphere is particularly useful.

The Box-Muller transform is slow because it requires log, sqrt, sin, and cos. Depending on your needs, you can approximate all of these.

log2 can be easily approximated using fast inverse square root tricks:

    constexpr float fast_approx_log2(float x) {
      x = std::bit_cast<int, float>(x);
      constexpr float a = 1.0f / (1 << 23);
      x *= a;
      x -= 127.0f;
      return x;
    }

(conveniently, this also negates the need to ensure your input is not zero)

sqrt is pretty fast; turn `-ffast-math` on. (this is already the default on GPUs) (remember that you're normalizing the resultant vector, so add this to the mag_sqr before square rooting it)

The slow part of sin/cos is precise range reduction. We don't need that. The input to sin/cos Box-Muller is by construction in the range [0,2pi]. Range reduction is a no-op.

For my particular niche, these approximations and the resulting biases are justified. YMMV. When I last looked at it, the fast log2 gave a bunch of linearities where you wanted it to be smooth, however across multiple dimensions these linearities seemed to cancel out.

jlebar•4h ago

fastmath is absolutely not the default on any GPU compiler I have worked with (including the one I wrote).

If you want fast sqrt (or more generally, if you care at all about not getting garbage), I would recommend using an explicit approx sqrt function in your programming language rather than turning on fastmath.

thoughtFrame•4h ago

I've read about the fast inverse square root trick, but it didn't occur to me that it can be used for other formulas or operations. Is this a common trick in DSP/GPU-like architectures nowadays?

And what's the mathematical basis? (that is, is this technique formalized anywhere?)

It seems insane to me that you run Newton's algorithm straight on the IEEE 754 format bits and it works, what with the exponent in excess coding and so on

AlotOfReading•3h ago

You can use the same techniques as fast inverse sqrt anywhere logs are useful. It's not particularly common these days because it's slower than a dedicated instruction and there are few situations where the application is both bottlenecked by numerical code and is willing to tolerate the accuracy issues. A pretty good writeup on how fast inverse sqrt works was done here: https://github.com/francisrstokes/githublog/blob/main/2024%2...

A lot of old-school algorithms like CORDIC went the same way.

There's a related technique to compute exponentials with FMA that's somewhat more useful in ML (e.g. softmax), but it has similar precision issues and activation functions are so fast relative to matmul that it's not usually worth it.

nwallin•1h ago

1/sqrt(x) is complicated. Imagine instead of computing 1/sqrt(x), imagine instead that you wanted to compute exp_2(-.5 log_2(x)). Also imagine you have an ultra fast way to compute exp_2 and log_2. If you have an ultra fast way to compute exp_2 and log_2, then exp_2(-.5 log_2(x)) is gonna be fast to compute.

It turns out you do have an ultra fast way to compute log_2: you bitcast a float to an integer, and then twiddle some bits. The first 8 bits (after the sign bit, which is obviously zero because we're assuming our input is positive) or whatever are the exponent, and the trailing 23 bits are a linear interpolation between 2^n and 2^(n+1) or whatever. exp_2 is the same but in reverse.

You can simply convert the integer to floating point, multiply by -.5, then convert back to integer. But multiplying -.5 by x can be applied to a floating point operating directly on its bits, but it's more complicated. You'll need to do some arithmetic, and some magic numbers.

So you're bitcasting to an integer, twiddling some bits, twiddling some bits, twiddling some bits, twiddling some bits, and bitcasting to a float. It turns out that all the bit twiddling simplifies if you do all the legwork, but that's beyond the scope of this post.

So there you go. You've computed exp_2(-.5 log_2 x). You're done. Now you need to figure out how to apply that knowledge to the inverse square root.

It just so happens that 1/sqrt(x) and exp(-.5 log x) are the same function. exp(-.5 log x) = exp(log(x^-.5)) = x^-.5 = 1/sqrt(x).

Any function where the hard parts are computing log_2 or exp_2 can be accelerated this way. For instance, x^y is just exp_2(y log_2 x).

Note that in fast inverse square root, you're not doing Newton's method on the integer part, you're doing it on the floating point part. Newton's method doesn't need to be done at all, it just makes the final result more accurate.

Here's a blog here that gets into the nitty gritty of how and why it works, and a formula to compute the magic numbers: https://h14s.p5r.org/2012/09/0x5f3759df.html

DoctorOetker•5h ago

I was hoping it generated hyperuniform samples on a sphere.

AYBABTME•4h ago

Isn't the density distribution of values going to be higher along the directions pointing in the cube's corners? There's more volume between the sphere and nearby the corners than between the sphere and nearby the faces' centres.

philipdloewen•4h ago

That’s why the method discards points outside the sphere, and returns a normalized point generated from an interior sample.

joshka•4h ago

You can show the exact opposite of this in a degenerate fixed point situation. Say you have -1, 0, +1 in each dimension. The only valid coordinates are the 6 on each face. (+-1, 0, 0) (0, +-1, 0) (0, 0, +-1). Not sure if this is the only counter example. I'd guess that with floating point math and enough bits the bias would be very small and probably even out.

ted_dunning•4h ago

An easy way to make this more efficient is to proceed as normal, but if the point is outside the sphere, run the algorithm again using cyclic xor's of the coordinates. This gives you a free second try without generating new random deviates.

You can't do the XOR in floating point representation, but you can if you do the entire algorithm in fixed point or if you retain the random bits before converting to a floating point value.

This decreases the number of random numbers that need to be generated from ~5.72 to ~3.88.

glitchc•4h ago

Instead of discarding points, why not project them all onto the sphere?

dragonwriter•4h ago

Because then they won't be uniformly distributed on the sphere (they'll be denser in the direction of the cubes corners.)

gerdesj•4h ago

OP is able to create random points to infinite precision or the spherical cow has as many points on it as you like. Many here don't have the luxury of a Hilbert monitor.

If you convert (non mathematician here!) your sphere into an n-agon with an arbitrarily fine mesh of triangular faces, is the method described by OP still valid. ie generate ...

... now I come to think of it, you now have finite faces which are triangular and that leads to a mad fan of tetrahedrons and I am trying to use a cubic lattice to "simplify" finding a series of random faces. Well that's bollocks!

Number the faces algorithmically. Now you have a linear model of the "sphere". Generating random points is trivial. For a simple example take a D20 and roll another D20! Now, without toddling off to the limit, surely the expensive part becomes mapping the faces to points on a screen. However, the easy bit is now the random points on the model of a sphere.

When does a triangular mesh of faces as a model of a sphere become unworkable and treating a sphere instead as a series of points at a distance from a single point - with an arbitrary accuracy - become more or less useful?

I don't think that will be an issue for IT - its triangles all the way and the more the merrier. For the rest of the world I suspect you normally stick with geometry and hope your slide rule can cope.

AGI is an engineering problem, not a model training problem

The cost of interrupted work (2023)

Show HN: How to Build a Coding Agent (free workshop)

I built a tiny mac app to monitor and manage my development processes

Line scan camera image processing for train photography

How can AI ID a cat?

A 2k-year-old sun hat worn by a Roman soldier in Egypt

What makes Claude Code so damn good

Static sites with Python, uv, Caddy, and Docker

Evaluating LLMs for my personal use case

Physics of badminton's new killer spin serve

Librebox: An open source, Roblox-compatible game engine

Taking a look at my old Palm IIIx – by Paul Lefebvre

Acronis True Image costs performance when not used

RFC 9839 and Bad Unicode

Texas Instruments’ new plants where Apple will make iPhone chips

Motion (YC W20) Is Hiring Principal Software Engineers

Why was Apache Kafka created?

Debdelta

DeepCode: Open Agentic Coding

Romhack.ing's Internet Archive Mirror No Longer Available

Hacker and physicist – a tale of "common sense"

Not so prompt: Prompt optimization as model selection (2024)

The Cornervery: A 90-Degree Stapler

Libre – An anonymous social experiment without likes, followers, or ads

Writing Speed-of-Light Flash Attention for 5090 in CUDA C++

Monoid-Augmented FIFOs, Deamortised

Optimizing our way through Metroid

Exploring EXIF (2023)

A simple way to generate random points on a sphere

A simple way to generate random points on a sphere

Comments

AGI is an engineering problem, not a model training problem

The cost of interrupted work (2023)

Show HN: How to Build a Coding Agent (free workshop)

I built a tiny mac app to monitor and manage my development processes

Line scan camera image processing for train photography

How can AI ID a cat?

A 2k-year-old sun hat worn by a Roman soldier in Egypt

What makes Claude Code so damn good

Static sites with Python, uv, Caddy, and Docker

Evaluating LLMs for my personal use case

Physics of badminton's new killer spin serve

Librebox: An open source, Roblox-compatible game engine

Taking a look at my old Palm IIIx – by Paul Lefebvre

Acronis True Image costs performance when not used

RFC 9839 and Bad Unicode

Texas Instruments’ new plants where Apple will make iPhone chips

Motion (YC W20) Is Hiring Principal Software Engineers

Why was Apache Kafka created?

Debdelta

DeepCode: Open Agentic Coding

Romhack.ing's Internet Archive Mirror No Longer Available

Hacker and physicist – a tale of "common sense"

Not so prompt: Prompt optimization as model selection (2024)

The Cornervery: A 90-Degree Stapler

Libre – An anonymous social experiment without likes, followers, or ads

Writing Speed-of-Light Flash Attention for 5090 in CUDA C++

Monoid-Augmented FIFOs, Deamortised

Optimizing our way through Metroid

Exploring EXIF (2023)

A simple way to generate random points on a sphere