Why not just use random_device?

Question

I am a bit confused about the c++11 random library.

What I understand: we need two separate concepts:

random engine (which can be pseudo (need seed) or real)
distribution: it maps the numbers obtained from the engine to a specific interval, using a specific distribution.

What I don't understand is why not just use this:

std::random_device rd;
std::uniform_int_distribution<int> dist(1, 5);

// get random numbers with:
dist(rd);

As far as I can tell this works well.

Instead, this is what I found on most examples/sites/articles:

std::random_device rd;
std::mt19937 e{rd()}; // or std::default_random_engine e{rd()};
std::uniform_int_distribution<int> dist{1, 5};

// get random numbers with:
dist(e);

I am not talking about special use, e.g. cryptography, just your basic getting started articles.

My suspicion is because std::mt19937 (or std::default_random_engine) accepts a seed, it can be easier to debug by providing the same seed during a debug session.

Also, why not just:

std::mt19937 e{std::random_device{}()};

"the performance of many implementations of random_device degrades sharply once the entropy pool is exhausted. For practical use random_device is generally only used to seed a PRNG such as mt19937" source — Michael
Well, an answer might want to go into more detail (e.g. why the performance degrades). Hence why I posted it as a comment. — Michael
Regarding your last question; that is exactly how I frequently spin up a prng for many non-cryptographic-mandated purposes. — WhozCraig

Chris Beck Chris Beck · Accepted Answer · 2016-09-02T23:12:27

Also, why not just:

std::mt19937 e{std::random_device{}()};

It might be fine if you only will do this once, but if you will do it many times, it's better to keep track of your std::random_device and not create / destroy it unnecessarily.

It may be helpful to look at the libc++ source code for implementation of std::random_device, which is quite simple. It's just a thin wrapper over std::fopen("/dev/urandom"). So each time you create a std::random_device you are getting another filesystem handle, and pay all associated costs.

On windows, as I understand, std::random_device represents some call to a microsoft crypto API, so you are going to be initializing and destroying some crypto library interface everytime you do this.

It depends on your application, but for general purposes I wouldn't think of this overhead as always negligible. Sometimes it is, and then this is great.

I guess this ties back into your first question:

Instead, this is what I found on most examples/sites/articles:

 std::random_device rd;
 std::mt19937 e{rd()}; // or std::default_random_engine e{rd()};
 std::uniform_int_distribution<int> dist{1, 5};

At least the way I think about it is:

std::mt19937 is a very simple and reliable random generator. It is self-contained and will live entirely in your process, not calling out to the OS or anything else. The implementation is mandated by the standard, and at least in boost, it used the same code everywhere, derived from the original mt19937 paper. This code is very stable and it's cross-platform. You can be pretty confident that initializing it, querying from it, etc. is going to compile to similar code on any platform that you compile it on, and that you will get similar performance.
std::random_device by contrast is pretty opaque. You don't really know exactly what it is, what it's going to do, or how efficient it will be. You don't even know if it can actually be acquired -- it might throw an exception when you attempt to create it. You know that it doesn't require a seed. You're not usually supposed to pull tons and tons of data from it, just use it to generate seeds. Sometimes, it acts as a nice interface to cryptographic APIs, but it's not actually required to do that and sadly sometimes it doesn't. It might correspond to /dev/random on unix, it might correspond to /dev/urandom/. It might correspond to some MSVC crypto API (visual studio), or it might just be a fixed constant (mingw). If you cross-compile for some phone, who knows what it will do. (And even when you do get /dev/random, you still have the problem that performance may not be consistent -- it may appear to work great, until the entropy pool runs out, and then it runs slow as a dog.)

The way I think about it is, std::random_device is supposed to be like an improved version of seeding with time(NULL) -- that's a low bar, because time(NULL) is a pretty crappy seed all things considered. I usually use it where I would have used time(NULL) to generate a seed, back in the day. I don't really consider it all that useful outside of that.

Why not just use random_device?

3 Answers