205
votes

I've been using Random (java.util.Random) to shuffle a deck of 52 cards. There are 52! (8.0658175e+67) possibilities. Yet, I've found out that the seed for java.util.Random is a long, which is much smaller at 2^64 (1.8446744e+19).

From here, I'm suspicious whether java.util.Random is really that random; is it actually capable of generating all 52! possibilities?

If not, how can I reliably generate a better random sequence that can produce all 52! possibilities?

9
"how can I surely generate a real random number over 52!" The numbers from Random are never real random numbers. It's a PRNG, where P stands for "pseudo." For real random numbers, you need a source of randomness (such as random.org).T.J. Crowder
@JimGarrison That is not what OP's after. He's talking about 10^68 possible sequences. Since each pseudo-random sequence is identified by its seed, OP says there could be at the most 2^64 different sequences.Sergey Kalinichenko
I think it's an interesting question, and is worth thinking about. But I can't help wondering about your problem context: what it is exactly that's leading to the requirement to be able to generate all 52! permutations? For example, in real-world bridge we can shuffle the deck and deal one card at a time, yet there are only ~6e11 different hands since many different permutations result in the same hand. Thinking in the other direction, do you need a solution specifically for 52!, or do you need one that generalizes to, say, two decks shuffled together (104!/(2**52) possibilities, or ~2e150)?NPE
@NPE - Take Solitaire (Klondike) for instance, 52! is exactly the number of possible hands..Sergei Emelianov
I think this is an interesting read: superuser.com/a/712583Dennis_E

9 Answers

153
votes

Selecting a random permutation requires simultaneously more and less randomness than what your question implies. Let me explain.

The bad news: need more randomness.

The fundamental flaw in your approach is that it's trying to choose between ~2226 possibilities using 64 bits of entropy (the random seed). To fairly choose between ~2226 possibilities you're going to have to find a way to generate 226 bits of entropy instead of 64.

There are several ways to generate random bits: dedicated hardware, CPU instructions, OS interfaces, online services. There is already an implicit assumption in your question that you can somehow generate 64 bits, so just do whatever you were going to do, only four times, and donate the excess bits to charity. :)

The good news: need less randomness.

Once you have those 226 random bits, the rest can be done deterministically and so the properties of java.util.Random can be made irrelevant. Here is how.

Let's say we generate all 52! permutations (bear with me) and sort them lexicographically.

To choose one of the permutations all we need is a single random integer between 0 and 52!-1. That integer is our 226 bits of entropy. We'll use it as an index into our sorted list of permutations. If the random index is uniformly distributed, not only are you guaranteed that all permutations can be chosen, they will be chosen equiprobably (which is a stronger guarantee than what the question is asking).

Now, you don't actually need to generate all those permutations. You can produce one directly, given its randomly chosen position in our hypothetical sorted list. This can be done in O(n2) time using the Lehmer[1] code (also see numbering permutations and factoriadic number system). The n here is the size of your deck, i.e. 52.

There is a C implementation in this StackOverflow answer. There are several integer variables there that would overflow for n=52, but luckily in Java you can use java.math.BigInteger. The rest of the computations can be transcribed almost as-is:

public static int[] shuffle(int n, BigInteger random_index) {
    int[] perm = new int[n];
    BigInteger[] fact = new BigInteger[n];
    fact[0] = BigInteger.ONE;
    for (int k = 1; k < n; ++k) {
        fact[k] = fact[k - 1].multiply(BigInteger.valueOf(k));
    }

    // compute factorial code
    for (int k = 0; k < n; ++k) {
        BigInteger[] divmod = random_index.divideAndRemainder(fact[n - 1 - k]);
        perm[k] = divmod[0].intValue();
        random_index = divmod[1];
    }

    // readjust values to obtain the permutation
    // start from the end and check if preceding values are lower
    for (int k = n - 1; k > 0; --k) {
        for (int j = k - 1; j >= 0; --j) {
            if (perm[j] <= perm[k]) {
                perm[k]++;
            }
        }
    }

    return perm;
}

public static void main (String[] args) {
    System.out.printf("%s\n", Arrays.toString(
        shuffle(52, new BigInteger(
            "7890123456789012345678901234567890123456789012345678901234567890"))));
}

[1] Not to be confused with Lehrer. :)

60
votes

Your analysis is correct: seeding a pseudo-random number generator with any specific seed must yield the same sequence after a shuffle, limiting the number of permutations that you could obtain to 264. This assertion is easy to verify experimentally by calling Collection.shuffle twice, passing a Random object initialized with the same seed, and observing that the two random shuffles are identical.

A solution to this, then, is to use a random number generator that allows for a larger seed. Java provides SecureRandom class that could be initialized with byte[] array of virtually unlimited size. You could then pass an instance of SecureRandom to Collections.shuffle to complete the task:

byte seed[] = new byte[...];
Random rnd = new SecureRandom(seed);
Collections.shuffle(deck, rnd);
26
votes

In general, a pseudorandom number generator (PRNG) can't choose from among all permutations of a 52-item list if its maximum cycle length is less than 226 bits.

java.util.Random implements an algorithm with a modulus of 248 and a maximum cycle length of only 48 bits, so much less than the 226 bits I referred to. You will need to use another PRNG with a bigger cycle length, specifically one with a maximum cycle length of 52 factorial or greater.

See also "Shuffling" in my article on random number generators.

This consideration is independent of the nature of the PRNG; it applies equally to cryptographic and noncryptographic PRNGs (of course, noncryptographic PRNGs are inappropriate whenever information security is involved).


Although java.security.SecureRandom allows seeds of unlimited length to be passed in, the SecureRandom implementation could use an underlying PRNG (e.g., "SHA1PRNG" or "DRBG"). And it depends on that PRNG's maximum cycle length whether it's capable of choosing from among 52 factorial permutations.

18
votes

Let me apologize in advance, because this is a little tough to understand...

First of all, you already know that java.util.Random is not completely random at all. It generates sequences in a perfectly predictable way from the seed. You are completely correct that, since the seed is only 64 bits long, it can only generate 2^64 different sequences. If you were to somehow generate 64 real random bits and use them to select a seed, you could not use that seed to randomly choose between all of the 52! possible sequences with equal probability.

However, this fact is of no consequence as long as you're not actually going to generate more than 2^64 sequences, as long as there is nothing 'special' or 'noticeably special' about the 2^64 sequences that it can generate.

Lets say you had a much better PRNG that used 1000-bit seeds. Imagine you had two ways to initialize it -- one way would initialize it using the whole seed, and one way would hash the seed down to 64 bits before initializing it.

If you didn't know which initializer was which, could you write any kind of test to distinguish them? Unless you were (un)lucky enough to end up initializing the bad one with the same 64 bits twice, then the answer is no. You could not distinguish between the two initializers without some detailed knowledge of some weakness in the specific PRNG implementation.

Alternatively, imagine that the Random class had an array of 2^64 sequences that were selected completely and random at some time in the distant past, and that the seed was just an index into this array.

So the fact that Random uses only 64 bits for its seed is actually not necessarily a problem statistically, as long as there is no significant chance that you will use the same seed twice.

Of course, for cryptographic purposes, a 64 bit seed is just not enough, because getting a system to use the same seed twice is computationally feasible.

EDIT:

I should add that, even though all of the above is correct, that the actual implementation of java.util.Random is not awesome. If you are writing a card game, maybe use the MessageDigest API to generate the SHA-256 hash of "MyGameName"+System.currentTimeMillis(), and use those bits to shuffle the deck. By the above argument, as long as your users are not really gambling, you don't have to worry that currentTimeMillis returns a long. If your users are really gambling, then use SecureRandom with no seed.

10
votes

I'm going to take a bit of a different tack on this. You're right on your assumptions - your PRNG isn't going to be able to hit all 52! possibilities.

The question is: what's the scale of your card game?

If you're making a simple klondike-style game? Then you definitely don't need all 52! possibilities. Instead, look at it like this: a player will have 18 quintillion distinct games. Even accounting for the 'Birthday Problem', they'd have to play billions of hands before they'd run into the first duplicate game.

If you're making a monte-carlo simulation? Then you're probably okay. You might have to deal with artifacts due to the 'P' in PRNG, but you're probably not going to run into problems simply due to a low seed space (again, you're looking at quintillions of unique possibilities.) On the flip side, if you're working with large iteration count, then, yeah, your low seed space might be a deal-breaker.

If you're making a multiplayer card game, particularly if there's money on the line? Then you're going to need to do some googling on how the online poker sites handled the same problem you're asking about. Because while the low seed space issue isn't noticeable to the average player, it is exploitable if it's worth the time investment. (The poker sites all went through a phase where their PRNGs were 'hacked', letting someone see the hole cards of all the other players, simply by deducing the seed from exposed cards.) If this is the situation you're in, don't simply find a better PRNG - you'll need to treat it as seriously as a Crypto problem.

9
votes

Short solution which is essentially the same of dasblinkenlight:

// Java 7
SecureRandom random = new SecureRandom();
// Java 8
SecureRandom random = SecureRandom.getInstanceStrong();

Collections.shuffle(deck, random);

You don't need to worry about the internal state. Long explanation why:

When you create a SecureRandom instance this way, it accesses an OS specific true random number generator. This is either an entropy pool where values are accessed which contain random bits (e.g. for a nanosecond timer the nanosecond precision is essentially random) or an internal hardware number generator.

This input (!) which may still contain spurious traces are fed into a cryptographically strong hash which removes those traces. That is the reason those CSPRNGs are used, not for creating those numbers themselves! The SecureRandom has a counter which traces how many bits were used (getBytes(), getLong() etc.) and refills the SecureRandom with entropy bits when necessary.

In short: Simply forget objections and use SecureRandom as true random number generator.

4
votes

If you consider the number as just an array of bits (or bytes) then maybe you could use the (Secure)Random.nextBytes solutions suggested in this Stack Overflow question, and then map the array into a new BigInteger(byte[]).

3
votes

A very simple algorithm is to apply SHA-256 to a sequence of integers incrementing from 0 upwards. (A salt can be appended if desired to "get a different sequence".) If we assume that the output of SHA-256 is "as good as" uniformly distributed integers between 0 and 2256 - 1 then we have enough entropy for the task.

To get a permutation from the output of SHA256 (when expressed as an integer) one simply needs to reduce it modulo 52, 51, 50... as in this pseudocode:

deck = [0..52]
shuffled = []
r = SHA256(i)

while deck.size > 0:
    pick = r % deck.size
    r = floor(r / deck.size)

    shuffled.append(deck[pick])
    delete deck[pick]
1
votes

My Empirical research results are Java.Random is not totally truly random. If you try yourself by using Random class "nextGaussian()"-method and generate enough big sample population for numbers between -1 and 1, the graph is normal distbruted field know as Gaussian Model.

Finnish goverment owned gambling-bookmarker have a once per day whole year around every day drawn lottery-game where winning table shows that the Bookmarker gives winnings in normal distrbuted way. My Java Simulation with 5 million draws shows me that with nextInt() -methdod used number draw, winnings are normally distributed same kind of like the my Bookmarker deals the winnings in each draw.

My best picks are avoiding numbers 3 and 7 in each of ending ones and that's true that they are rarely in winning results. Couple of times won five out of five picks by avoiding 3 and 7 numbers in ones column in Integer between 1-70 (Keno).

Finnish Lottery drawn once per week Saturday evenings If you play System with 12 numbers out of 39, perhaps you get 5 or 6 right picks in your coupon by avoiding 3 and 7 values.

Finnish Lottery have numbers 1-40 to choose and it takes 4 coupon to cover all the nnumbers with 12 number system. The total cost is 240 euros and in long term it's too expensive for the regural gambler to play without going broke. Even if you share coupons to other customers available to buy still you have to be quite a lucky if you want to make profit.