Why not just use random_device?
This question is actually a very good one.
The answer is - of course you can just use std::random_device exactly like you written in your example. It is totally legitimate and correct use of std::random_device
- and any distribution may be used on top of it just like with any other random engine. If you don't need Pseudo-Random Number Generator (PRNG) like std::mt19937 or any other - just don't use it. That's it.
Mantra repeated by many people a la - "std::random_device is just for seeding blah-blah-blah" is a random BS (pun intended) that have nothing to do with the meaning and purpose of std::random_device
. Sure though std::random_device
can be used as a PRNG seed - just like any other source of random information.
Having said that - whether you actually should just use std::random_device
instead of a good PRNG depends entirely on your application needs - some details written below.
You should view any PRNG as a mathematical function that takes limited size input sequence of bits and produces a very long output sequence of numbers with some (typically uniform) distribution. If you pass the same input bits to the same PRNG twice - you will get the same output sequence. Just like if you use the same value of x
to compute std::sin(x)
twice - you will get exactly the same sine value returned. That's why if you need to avoid repeating the same PRNG output number sequence each time - its input bits (seed) must be different each time. Obviously since PRNG operation requires nothing more than some computations - it is usually local and fast - no system calls, no external devices involved, no blocking while waiting for something, no exceptions thrown - instant result and high rate of numbers generated that scales easily when CPU performance increases.
std::random_device
on the other hand is the first attempt to introduce actual random number generator in C++ standard library.
Quote from C++ standard (ISO/IEC 14882-2017):
29.6.6 Class random_device
- A random_device uniform random bit generator produces nondeterministic random numbers.
- If implementation limitations prevent generating nondeterministic random numbers, the implementation may employ a random number engine.
^^^ This quote is quite funny because (1) and (2) above entirely contradict each other. std::random_device
either produces nondeterministic random numbers OR its implementation prevents it - both cannot possibly be true at the same time. But word "if" and "may" are present only in (2) - so the only possible non-contradictory understanding of the quote above is that "if" from (2) is never realized and every implementation just produces nondeterministic random numbers - i.e. complies with (1).
Let's just assume that standard compliant std::random_device
simply produces uniformly distributed sequence of random and independent bits. If we are super optimistic we can even hope to get cryptographically secure random numbers - even though C++ standard doesn't guarantee or even promise anything like that. Good news is that modern implementations actually provide it - both typical /dev/urandom
UNIX implementation and Win32 Crypto API implementation should be secure enough. Without being cryptographically secure std::random_device
is not very useful tool anyway.
Especially since according to the C++ standard:
result_type operator()();
6 Returns: A nondeterministic random value, uniformly distributed between min() and max(), inclusive.
It is implementation-defined how these values are generated.
^^^ thus if we really need to - we may somewhat limit the portability of the application to only those implementations that produce cryptographically secure output of std::random_device::operator()()
- as this is well defined and documented individually for each particular implementation (that's what implementation-defined means BTW). Of course if we don't need some strict requirements like secure random numbers we shouldn't limit the portability.
Obviously nondeterministic sequence of uniformly distributed and independent random bits (AKA true random numbers) cannot be produced without external source of information (external source of randomness) - like some sensor signal noise or precisely measured times of some external events - anything external and inherently irregular. (By external I mean that information comes from outside media - but the sensor itself may be integrated to CPU/SoC). Any such source of external randomness then typically filtered to remove detectable regularities to assure compliance with a requirement of uniformly distributed and independent bit sequence output. All this greatly limits data rate produced and creates a possibility of failure and/or blocking while waiting for new external data.
So let's now weigh pros and cons of PRNG sequence VS true random numbers for different kinds of applications.
If application requires random number generation for information security purposes - passwords, encryption keys, salt values, security tokens, etc. then there is no question - of C++ standard features only std::random_device
and not any compliant one (let alone non-compliant ones) but only those that provide cryptographically secure implementations. Some PRNGs may be used for information security purposes too but only special class of secure PRNG and only if they are carefully seeded with secure random seed of enough size (enough entropy) - so you need true random numbers anyway to produce a seed. As of now - no PRNG engine from C++ standard library is cryptographically secure. If you don't trust std::random_device
to be secure (for example - you don't want to limit portability to suitable implementations only or you want to avoid the effort of checking implementation suitability for each supported platform after each update) then only non-standard trusted third party solution can be used safely - e.g. Win32 crypto API directly or UNIX /dev/random or /dev/urandom directly or some other non-standard solution - whatever is trustworthy enough for you.
Other sensitive applications where non-predictability of random numbers is very important - like online casinos, online betting, stock market trading, etc. - also probably require cryptographically secure random numbers - so all the considerations from (1) apply here too.
For most other applications - like usual games or scientific simulations or anything where money or security not involved and thus potential predictability of random number sequence cannot harm - typical good quality PRNG is sufficient. While it may be fine to still use std::random_device
for many of these applications but only if performance (rate and delay of generation) is not important. In many cases performance is actually of major importance - e.g. for scientific simulations or real-time noise simulation (for computer graphics or sound effects, etc.) - so sometimes true random numbers are unsuitable for performance reasons.
Also there are applications where PRNG is fundamentally necessary - e.g. some games may generate maps/worlds/levels on the fly using PRNG with a fixed seed value to avoid storing them to save disk space (this was a popular trick on early computers with very little RAM and storage space but still used in some modern games too). Another example is noise substitution phase of audio/video compression algorithms - where actual background noise is replaced with PRNG-generated pseudo-noise of the same magnitude and spectral characteristics as the original noise to store just a seed to a compressed bitstream instead of lots of actual uncompressible random information.
One last note:
If you don't need secure random numbers and don't want to rely on quality or even standard compliance of std::random_device
implementation then using it alone to generate PRNG seed is also a bad idea. You should put more randomness into the mix - e.g. combine std::random_device
output with current time of maximum available precision (microsecond/nanosecond/whatever is available) and if available add some other external sensors' readings too (e.g. raw gyro sensor readings or audio mic readings or raw image sensor readings - anything external and noisy).
For example.
Instead of using this:
std::mt19937::result_type seed = std::random_device()();
std::mt19937 gen(seed);
Better use something like this:
std::mt19937::result_type seed = std::random_device()()
^ std::chrono::duration_cast<std::chrono::seconds>(
std::chrono::system_clock::now().time_since_epoch()
).count()
^ std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now().time_since_epoch()
).count()
/* ^ more_external_random_stuff */ ;
std::mt19937 gen(seed);
You may also initialize full state seed sequence of std::mt19937::state_size
(=624) 32-bit numbers:
std::random_device rd;
std::array< std::uint32_t, std::mt19937::state_size > seed_array;
for( auto it = seed_array.begin(); it != seed_array.end(); ++it )
{
// read from std::random_device
*it = rd();
// mix with a C++ equivalent of time(NULL) - UNIX time in seconds
*it ^= std::chrono::duration_cast<std::chrono::seconds>(
std::chrono::system_clock::now().time_since_epoch()
).count();
// mix with a high precision time in microseconds
*it ^= std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now().time_since_epoch()
).count();
//*it ^= more_external_random_stuff;
}
std::seed_seq sseq( seed_array.cbegin(), seed_array.cend() );
std::mt19937 gen(sseq);