frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Open in hackernews

Show HN: Fast Random Library for C++17

https://github.com/DmitriBogdanov/UTL/blob/master/docs/module_random.md
52•GeorgeHaldane•1d ago
Morning HN.

Random number generation feels is a somewhat underrepresented topic in the C++ realm. There is a lot of questionable info about it found online and even the standard library is quite behind the times in terms of it's algorithms. It suffers from trying to accommodate sometimes impractical standard requirements and has several ways of getting significantly bad statistical results. This leaves a lot easily achievable performance & quality on the table.

So, being a mathematician who mostly works with stochastic models and wants these models to run fast and well, I embarked on a journey of trying to summarize "what is good and what is bad" and implement the "best stuff there currently is".

Thankfully, the design of C++ <random> is quite flexible and easy to extend. With some cleanup, generalization and compile-time logic all the different algorithms can be wrapped in a generic standard-compatible API.

A result of this work is single-header RNG library which has:

  - <random>-compatible generators (PRNGSs) with 3x-6x better performance
  - Cryptographically secure generators (CSPRNGs)
  - Faster uniform / normal distributions that produce same sequences on every platform
  - Quick approximations of some non-linear distributions
  - More reliable entropy sources that std::random_device()
  - rand()-like API for when we just want random numbers without the boilerplate of proper a <random> setup
Effectively all of this gets us 2x-8x speedups on many workloads while producing even better statistical quality.

Don't think there is anything else like it, so I would like to showcase the result here and hear some opinions on its improvement:

https://github.com/DmitriBogdanov/UTL/blob/master/docs/modul...

For those interested, there is a more detailed rundown of all the quirks of this topic at end of the docs, might prove an interesting read.

Comments

jeffbee•1d ago
This looks quite helpful. An especially useful feature is to have a stable uniform int distribution, even if it is just a copy of the GNU one. It is incredibly annoying that the standard dictates the output of the generators, but leaves the output of the distributions unspecified.
fluffyllemon•1d ago
How does this compare with Abseil's random library? https://abseil.io/docs/cpp/guides/random
bhickey•1d ago
This library appears to be insecure by default. I think there are vanishingly few use cases for non-crypto RNGs. We made absl random secure by default using randen: https://arxiv.org/abs/1810.02227

The algorithm is provably secure, so long as AES is secure. It is also backtracking resistant: an adversary with the current RNG state cannot step backwards.

On hardware with AES primitives, it's faster than MT, though slower than pcg64.

spacechild1•1d ago
> I think there are vanishingly few use cases for non-crypto RNGs.

There are quite a few applications:

- audio applications

- computer games

- simulations

- etc.

bhickey•1d ago
Years ago a player figured out how to decode the RNG state in dungeon crawl. This allowed them to essentially God mode the game because they could determine when a monster would land a hit.
Night_Thastus•1d ago
Out of curiosity, how do audio applications use RNG?
magicalhippo•1d ago
Noise. Either directly as a source, which is then filtered to produce say a cymbal sound, or indirectly by creating timing variance, ie notes not playing exactly at the beat each time, which makes things sound less artificial and more pleasing.

It can also be used to dither[1] the result when doing a final rendering, to improve the noise floor.

[1]: https://en.wikipedia.org/wiki/Noise_shaping

Night_Thastus•1d ago
Ah, ok. I was thinking playback. Noise makes total sense. Thanks!
magicalhippo•1d ago
I guess I should have pointed out the noise shaping could be useful if you're doing 16bit playback of a 24bit source, so possibly useful during playback as well.
thomasmg•1d ago
So I'm a bit confused, the paper says "'Randen' ... outperforming ... PCG" but now you wrote "though slower than pcg64"? In the paper itself, PCG is faster but "these microbenchmark results are quite irrelevant in practice — which actual application repeatedly calls a random generator and ignores the results?" I wonder, was the paper peer-reviewed? (I'm not saying it should have been..., just wondering.)

My guess is that yes, Randen is fast, but it is not quite as fast as PCG or other pseudo-random number generators.

Whether or not it is relevant, that's another story. I can not agree to your claim "there are vanishingly few use cases" for insecure random number generators.

jeffbee•1d ago
If you look inside Abseil it seems that the authors generally agree on performance. The default random in Abseil is Randen, but the faster variant the use of which is discouraged is PCG.
bhickey•1d ago
I was citing the micro benchmarks because they provide a pessimistic comparison in favor of other RNGs. My recollection is that it appeared as a conference paper, but I'd need to ask the author.
adastra22•1d ago
Every time you use a hash map you use a (PR)RNG. There are an abundant number of use cases for non-CS RNGs.
spacechild1•1d ago
That would be a very strange hash map :) Could it be that you are confusing hash functions with PRNGs?
adastra22•1d ago
Why do you think they are different? That's what a hash function is--mapping input to outputs with a pseudo-random distribution. Different words for literally the same thing.
spacechild1•1d ago
> That's what a hash function is--mapping input to outputs with a pseudo-random distribution

That's not what a PRNG does. A PRNG produces a sequence of pseudo-random values (with a particular distribution) based on an initial seed value.

A hash function is stateless, a PRNG is not.

adastra22•1d ago
When used as a hasher, the 'seed' is the input. When used as a PRNG tool, the seed is an incrementing counter or something. It's the same function though.
spacechild1•1d ago
I see what you mean. You can use a hash function to build a PRNG, but the hash function itself it not a PRNG.

In a hash table, the hash function is not used to produce a series of random numbers, that's why it's a bit silly to call it a PRNG. Instead, the purpose is to reduce a (possibly larger) input value to a fixed size output value, often with the requirement that similar input values should result in very different output values to avoid clustering.

However, your hash table example is still useful as an analogy.

jeffbee•1d ago
Well, when you use the Abseil hash maps the "random" number is actually a thread-local variable that monotonically increments, and the Abseil hash functions use static "random" data.
bhickey•1d ago
You've made yourself susceptible to hash flooding.
duped•1d ago
I mean the only use case for cryptographically secure RNGs is cryptography. I can't think of non-crypto use cases that need it, but all of those need performance (and determinism is sometimes desirable!).
SideQuark•1d ago
By far the vast majority of RNG calls in use need to be non-crypto. They are so many orders of magnitude too slow for so many uses that it's ludicrous to claim there are " vanishingly few use cases for non-crypto RNGs," unless you're trying to scare people into using your work.

Science, neural networks, simulation, gaming, rendering, weather, nuclear, robotics, signal processing, engineering, finance, and more industries require fast rngs to get billions to trillions of them quickly.

Very few things actually need secure - only the things that need a secure endpoint, and most of those simply use the secure rng to do a private key transfer algo, after which there is no more rngs.

Use the right tool for the job. Widen your view of what things are used for. Etc.

bhickey•6h ago
> They are so many orders of magnitude too slow for so many uses

On hardware with AES primitives this is simply false. Yes, embedded cases are different. There's some neat work on probabilistic computing that uses a xorshift RNG in hardware. These are specialized use cases that probably aren't your use case. Use the right tool for the job and try to be less condescending.

pantalaimon•1d ago
Shouldn't getrandom() be plenty fast these days?
ksherlock•1d ago
That seems to be linux/glibc specific.
zx2c4•1d ago
If you don't need "predictable randomness", like for repeatable statistical simulations, then absolutely, you should only use getrandom(). On recent Linux, this is implemented in the vDSO and is super fast. Few excuses now to use anything different.
wahern•1d ago
The portable API is getentropy, which glibc provides as a simple wrapper around getrandom. getentropy was added to POSIX, and is also available on most modern unix systems, including FreeBSD, Illumos, NetBSD, macOS, OpenBSD, and Solaris.

arc4random has been provided by glibc 2.36 (2022), and is available on all the above-mentioned systems as well. If you don't want to make a syscall per request (outside Linux), just use arc4random; it'll be the fastest method available. musl libc lacks arc4random, unfortunately, but you can always ship a small wrapper.

Systems that support arc4random also support arc4random_uniform, which is a way to get an unbiased unsigned integer between 0 and N (up to 2^32-1). That's probably the most important reason to use the arc4random family.

jeffbee•1d ago
vDSO getrandom has been in the kernel for what, two weeks? And it is only "super fast" compared to the unbelievably slow full syscall. Compared to PCG it is like watching rocks grow.
zx2c4•1d ago
Linus merged it on July 24, 2024, so about a year I guess. Kernel is released ~8 weeks after the merge window, so I suppose September or so.

I think neither are unbelievably slow. I dunno, take some measurements and see, maybe it suits you.

zx2c4•1d ago
Careful with the "chacha csprng" when the seed from the seed() function appears to be 32 or 64 bits. That's not enough for the cs part. (Also the output stream appears to wrap after 2**32 blocks. Could make this larger.)
OskarS•1d ago
I think it looks great! Might be using this in a future project.

One note on API design: I think it's a FANTASTIC idea to have default `rand()` functions available, since very often, you don't really care about the generator and you're just like "just give me a random number, i don't care how". But if you do, you shouldn't have a `seed()` function, because that means you can then never upgrade it without breaking your API contract. It should always be seeded with entropy. This is why glibc's `rand()` is still using an LCG from the late neolithic, they can't ever update it without breaking a gazillion applications and tests. This is the classic example of "Hyrum's law". Doing this also helps with thread-safety: you can just make the seed/state thread-local and then it just works.

Basically, if you want the ability to seed your PRNG, you also need to specify the generator explicitly. The global one is for convenience only, and there should be no ability in the API to seed it.

EDIT: btw, this is a really nice thing about C++, doing this is dead easy:

    int rand() {
        thread_local generator my_generator { seed_with_entropy() };
        return my_generator.rand();
    }
quietbritishjim•1d ago
I forget the details off hand, but I thought that C++ notoriously fails to include much entropy in the generator unless you go through some complex multiline dance that forces the seed sequence to actually include it.
SiempreViernes•1d ago
I think Oskars point is that if you say "I just want a random number, I don't care how" you can't very well be upset the seed is bad as that would be caring.
OskarS•1d ago
I’m assuming here that ”generator” has a decent enough constructor that seeds it properly. If it doesn’t, and you have to cycle it a few times to ”warm it up”, it’s not much more complicated, just factor out the initialization code into a function or lambda and use that to initialize the thread_local.
kccqzy•1d ago
You mean creating the generator with a deterministic seed is shorter code than creating it with entropy? Well yes, because the entropy source is std::random_device so it's an extra line for an extra variable.

Don't think it's a problem in practice because every online tutorial or Stack Overflow posts contain that multi-line snippet for real entropy. C++ programmers are somewhat used to verbosity anyways.

quietbritishjim•18h ago
> You mean creating the generator with a deterministic seed is shorter code than creating it with entropy?

No, it was more like: it's easy to write code that appears to fully seed a RNG with entropy, but in fact only seeds a tiny part of its state. Making it seed the whole lot, especially in a portable way, is surprisingly hard to get right. But I now can't find the article that explains this.

I seem to remember I had to use Boost.Random instead of the C++ standard library to do this in a real program, but this was a few years ago and the latest standard that was available in most implementations was C++11. Perhaps things have improved since.

joiguru•1d ago
Very interesting work!

Did you guys get a chance compare with [1]. These seen to be the standard for high-performance not Crypto RNGs.

[1]: https://github.com/DEShawResearch/random123

spacechild1•1d ago
This looks nice! One thing I find particularly noteworthy:

- Faster uniform / normal distributions that produce same sequences on every platform

This is very useful if you want to reliably repeat simulations across platforms!

One question:

    template<class T>
    constexpr T& rand_choice(std::initializer_list<T> objects) noexcept;

Isn't this returning a reference to a local variable? In my understanding, 'objects' is destroyed when rand_choice() returns.
GeorgeHaldane•1d ago
Thanks, that is indeed the case when list doesn't consist of literals, fixed in the new commit.
jll29•1d ago
Thanks for sharing, this is a very well-written and useful set of libraries, not just random, but also the other sub-libraries of utl.

One caveat:

> Note 2: If no hardware randomness is available, std::random_device falls back onto an internal PRNG ....

> ...

> std::random_device has a critical deficiency in it's design — in case its implementation doesn't provide a proper source of entropy, it is free to fallback onto a regular PRNGs that don't change from run to run. The method std::random_device::entropy() which should be able to detect that information is notoriously unreliable and returns different things on every platform.

> entropy() samples several sources of entropy (including the std::random_device itself) and is (almost) guaranteed to change from run to run even if it can't provide a proper hardware-sourced entropy that would be suitable for cryptography.

Personally, I think it would be best if there was a way to communicate to the system (or here, to the library in specific) what is the use case. For cryptographic applications, I don't want the library to fall back gracefully to something insecure; I would want a dark red critical error message and immediate termination with an "insufficient entropy error" error code.

However, for a game graceful degradation might be quite okay, because nobody is going to die in the real world if a monster behaves a little less random.

I learned a lot about recent advances in pseudo-random number generators by reading your code and associated documentation, including some stuff that DEK has yet to incorporate into volume 2 of TAOCP. ;)

nzzn•1d ago
For a more full featured Xoshiro/Xoroshiro implementation see

https://github.com/nessan/xoshiro

Documented at: https://nessan.github.io/xoshiro/

Handles the full family of these generators featuring arbitrary jump sizes, stream partitioning for parallel applications, and, like this library, a number of convenience sampling methods to shield the casual user from the complexities of using <random>.

Can be used with a companion `bit` library (https://github.com/nessan/bit) that performs efficient linear algebra and polynomial reduction over GF(2) for those that want to explore some of the mathematics behind these linear generators.

zX41ZdbW•1d ago
Nice trick:

> How is it faster than std: It uses the fact that popcount of a uniformly distributed integer follows a binomial distribution. By rescaling that binomial distribution and adding some linear fill we can achieve a curve very similar to a proper normal distribution in just a few instructions. While that level of precision is not suitable for a general use, in instances where quality is not particularly important (gamedev, fuzzing) this is perhaps the fastest possible way of generating normally distributed floats.

Iwan-Zotow•1d ago
Std::minstd at 100pct and std::mt19997 at 105pct?!?

Not quite believable...

Big Tech's AI Endgame Is Coming into Focus (an everything app)

https://www.theatlantic.com/technology/archive/2025/06/everything-app-big-tech-ai-endgame/683024/
3•petethomas•6m ago•0 comments

FFmpeg Merges WebRTC Support

https://github.com/FFmpeg/FFmpeg/commit/167e343bbe75515a80db8ee72ffa0c607c944a00
2•Sean-Der•7m ago•0 comments

Friendship rather than romance protects better from depression

https://www.psychologytoday.com/au/blog/living-single/202505/which-protects-best-from-depression-friendship-or-romance
1•nreece•9m ago•0 comments

Malicious RubyGems pose as Fastlane to steal Telegram API data

https://www.bleepingcomputer.com/news/security/malicious-rubygems-pose-as-fastlane-to-steal-telegram-api-data/
1•feross•11m ago•0 comments

Lodestar Multipliers in Delaware and Federal Attorney Fee Awards

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5237545
1•ckrailo•16m ago•0 comments

Installing *BSD in 2025 part 3 – A critical look at NetBSD's installer

https://eerielinux.wordpress.com/2025/05/31/installing-bsd-in-2025-part-3-a-critical-look-at-netbsds-installer/
1•jaypatelani•16m ago•0 comments

Show HN: Hacker News historic upvote and score data

https://hn.dunkirk.sh/
4•clacker-o-matic•17m ago•0 comments

AI can't solve novel problems yet

https://jamesoclaire.com/2025/06/04/ai-obviously-cant-yet-solve-novel-problems/
3•ddxv•25m ago•4 comments

Designing Algorithmic Delegates

https://arxiv.org/abs/2506.03102
2•MarcoDewey•30m ago•0 comments

Our Startup Was Hacked, Need GitHub's Assistance to Trace Attacker

https://techcrunch.com/2025/06/03/indian-grocery-startup-kiranapro-was-hacked-and-its-servers-deleted-ceo-confirms/
4•deepakravindran•36m ago•2 comments

Merlin Bird ID

https://merlin.allaboutbirds.org/
4•twitchard•37m ago•1 comments

Binary Wordle

https://wordle.chengeric.com/
1•eh8•38m ago•0 comments

The symbolism of the magnifying glass is not universal

https://devblogs.microsoft.com/oldnewthing/20250603-00/?p=111240
3•paulmooreparks•39m ago•1 comments

Google Scholar is Manipulatable (2024)

https://arxiv.org/abs/2402.04607
2•downboots•47m ago•0 comments

'Spiderweb' drone attack marks a new threat for top militaries

https://www.businessinsider.com/operation-spiderweb-5-ways-ukraine-drone-attack-new-era-warfare-2025-6
4•petethomas•48m ago•0 comments

Open Sesame! on the Security and Memorability of Verbal Passwords [pdf]

https://seclab.skku.edu/wp-content/uploads/2025/05/223600a683.pdf
1•grac3•49m ago•0 comments

Asian American genius got rejected from 16 colleges because of racism [video]

https://www.youtube.com/watch?v=wl7t3QiXYOI
4•robomartin•50m ago•1 comments

Chinese couple charged with smuggling a biological pathogen into the U.S.

https://www.nbcnews.com/politics/justice-department/chinese-couple-charged-smuggling-biological-pathogen-us-rcna208658
5•shinryudbz•53m ago•1 comments

How NATO is turning to startups to outpace its rivals

https://thenextweb.com/news/how-nato-startups-fight-future-wars
1•mikece•55m ago•0 comments

DiffX – Next-Generation Extensible Diff Format

https://diffx.org/
17•todsacerdoti•57m ago•0 comments

Flesh-eating New World Screwworm could pose health risks to cattle, humans

https://www.foxnews.com/health/flesh-eating-new-world-screwworm-could-pose-health-risks-cattle-humans
1•keepamovin•57m ago•1 comments

Why is PS3 emulation so fast: RPCS3 optimizations explained [video]

https://www.youtube.com/watch?v=19ae5Mq2lJE
5•alexjplant•59m ago•0 comments

Musk calls Trump's tax bill a 'disgusting abomination'

https://www.bbc.com/news/articles/c0j76djzgpvo
7•andsoitis•1h ago•2 comments

Ask HN: Stripe and Chargebacks

2•gtech1•1h ago•2 comments

Meta and Yandex exfiltrating tracking data on Android via WebRTC

https://arstechnica.com/security/2025/06/meta-and-yandex-are-de-anonymizing-android-users-web-browsing-identifiers/
2•liuandrewk•1h ago•1 comments

Indeed CEO resigns after major growth to focus on AI ethics

https://www.techinasia.com/news/indeed-ceo-resigns-after-major-growth-to-focus-on-ai-ethics
2•williswee•1h ago•0 comments

Why Is the US Dropping Billions of Mutant Flies from the Sky? [video]

https://www.youtube.com/watch?v=zxq60I5RSW8
4•keepamovin•1h ago•1 comments

Out of His League and Clueless: NIH Staffers Speak Out on Director Bhattacharya

https://www.importantcontext.news/p/out-of-his-depth-sold-his-soul-clueless
22•SubiculumCode•1h ago•2 comments

Show HN: Built an AI Agent that finds hidden bugs and maps web apps

https://testchimp.io/blog/agentic-exploratory-testing/
4•TestChimp•1h ago•0 comments

Show HN: All-in-one platform for AI image generation

https://www.imageninja.ai/
1•ashr_•1h ago•0 comments