3
votes

I am using mt19937 to generate a random string from a given seed like this:

std::string StringUtils::randstring(size_t length, uint64_t seed) {
    static auto& chrs = "abcdefghijklmnopqrstuvwxyz";

    thread_local static std::mt19937 rg(seed);
    thread_local static std::uniform_int_distribution<std::string::size_type> pick(0, sizeof(chrs) - 2);

    std::string s;
    s.reserve(length);

    while(length--) {
        s += chrs[pick(rg)];
    }

    return s;
}

I want to guarantee that the sequence of random numbers (and hence the generated random string) is the same across different machines of the same architecture, which should be the case according to the answers to this question.

However, when I rebuild the binary (without changing any dependency or library), the random number sequence changes for the same seed (compared to the sequence generated from the previous build with the same seed).

How do I generate a guaranteed sequence of random numbers from a given seed across different binaries on the same machine architecture+image (x86_64 Linux)?

1
Are you saying that the generator returns different numbers on your platforms? - Bathsheba
With the same binary, the sequence is the same on both machines A and B. But when I rebuild the binary and update it on machine A, the sequences generated by A and B are different. - jeffreyveon
I know what you're saying now. You're asking too much of the generator. MT requires 19937 bits of "state", and the seed you supply only gives 64 bits of that. There's lots of stuff out there on how to correctly seed MT19937 - an answer is beyond my pay grade I'm afraid. - Bathsheba
Some useful stuff here. Although some of the upvoted answers are awful: stackoverflow.com/questions/45069219/… - Bathsheba

1 Answer

2
votes

If reproducible "random" numbers are something you care about, you should avoid C++ distributions, including uniform_int_distribution, and instead rely on your own way to transform the pseudorandom numbers from mt19937 into the numbers you desire. (For example, I give ways to do so for uniform integers. Note that there are other things to consider when reproducibility is important.)

C++ distribution classes, such as uniform_int_distribution, have no standard implementation. As a result, these distribution classes can be implemented differently in different implementations of the C++ standard library. Note that it's not the "compiler", the "operating system", or the "architecture" that decides which algorithm is used. See also this question.

On the other hand, random engines such as mt19937 do have a guaranteed implementation; they return the same pseudorandom numbers for the same seed in all conforming C++ library implementations (including those for different "architectures"). The exception is default_random_engine, whose underlying engine is implementation-defined.
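As a rough illustration of the approach, here is a minimal sketch that draws letters directly from mt19937 and skips uniform_int_distribution. The helper name bounded is hypothetical, rejection sampling is just one possible way to map the raw output to a range, and the seed is narrowed to 32 bits here as a simplifying assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical helper (not part of the standard library): maps raw mt19937
// output to a uniform integer in [0, n) by rejection sampling. Requires n > 0.
std::uint32_t bounded(std::mt19937& rg, std::uint32_t n) {
    // mt19937 produces values in [0, 2^32 - 1], so there are 2^32 outcomes.
    const std::uint64_t range = std::uint64_t{1} << 32;
    // Largest multiple of n that does not exceed 2^32; values at or above it
    // are rejected so that every residue is equally likely.
    const std::uint64_t limit = range - (range % n);
    std::uint64_t x;
    do {
        x = rg();
    } while (x >= limit);
    return static_cast<std::uint32_t>(x % n);
}

// Sketch of the question's function using only the engine. The seed type is
// an assumption: a single 32-bit value, as accepted by mt19937's constructor.
std::string randstring(std::size_t length, std::uint32_t seed) {
    static const char chrs[] = "abcdefghijklmnopqrstuvwxyz";

    std::mt19937 rg(seed);  // fresh engine per call, so the seed is always used

    std::string s;
    s.reserve(length);
    while (length--) {
        s += chrs[bounded(rg, sizeof(chrs) - 1)];  // sizeof - 1 skips the '\0'
    }
    return s;
}
```

Because only the engine's output and plain integer arithmetic are involved, the result should not depend on which standard library the binary was built against. Note also that the engine is constructed on every call here; in the question's code the thread_local static engine is seeded only on the first call in each thread, so later calls ignore their seed argument and continue the earlier sequence.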