2
votes

I am trying to implement my own version of the Random class from the Python standard library. I can generate random bits, and I have implemented the getrandbits(n) method. But the superclass does not use this function to compute the float returned by random(), so I have to implement that myself:

def random(self):
    exp = 0x3FF0000000000000     # sign 0, biased exponent 1023, mantissa 0 -> 1.0
    mant = self.getrandbits(52)  # 52 random mantissa bits
    return struct.unpack("d", struct.pack("q", exp ^ mant))[0] - 1.0

I am using a sign of 0 (positive), an exponent of 1023 (2^0 = 1), and a random mantissa, so I get a number in [1.0, 2.0). Since random() must return a number in [0.0, 1.0), I subtract 1.0 before returning. As I'm no expert on floating-point numbers, I'm not sure this is the right way to do it. Don't I lose precision by subtracting? Can I build the number from random bits so that it lands in [0.0, 1.0) without the subtraction?
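For reference, here is a self-contained version of what I have, using the stdlib random module as a stand-in for my own bit generator (only for testing; my real class supplies its own getrandbits):

```python
import struct
import random  # stand-in for my own bit generator, just for testing

def my_random():
    exp = 0x3FF0000000000000           # sign 0, biased exponent 1023, mantissa 0 -> 1.0
    mant = random.getrandbits(52)      # 52 random mantissa bits
    # Combine into the bit pattern of a float in [1.0, 2.0), then shift down.
    return struct.unpack("d", struct.pack("q", exp ^ mant))[0] - 1.0

samples = [my_random() for _ in range(10000)]
assert all(0.0 <= x < 1.0 for x in samples)
```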

1
Sorry, but this sounds like a really bad idea to me. Ask yourself: "What would Donald Knuth do?" and run the Diehard tests on your final solution. – duffymo
This is not actually about the random generator itself; that can be assumed to be correct (I am wrapping the Python class around an existing, tested generator). The question is how to assemble floating-point randoms from the random bits the way the Python standard library does. – qasfux

1 Answer

2
votes

Your implementation is fine: assuming that getrandbits itself is random enough, your implementation will produce each number of the form n / 2^52 for 0 <= n < 2^52 with equal probability, so it's a good approximation to the uniform distribution on [0, 1). The subtraction you use is not a problem: the result from that subtraction is always exactly representable, so there's no rounding or precision loss involved in the subtraction.
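To convince yourself of the exactness, you can compare the result of the pack/unpack-and-subtract trick against the exact rational n / 2^52 (a quick sketch; the helper name via_bits is just for illustration):

```python
import random
import struct
from fractions import Fraction

def via_bits(n):
    # Build 1.0 + n/2**52 from its IEEE 754 bit pattern, then subtract 1.0.
    return struct.unpack("d", struct.pack("q", 0x3FF0000000000000 | n))[0] - 1.0

# Exhaustively checking all 2**52 mantissas is infeasible; spot-check the
# edge cases plus a batch of random ones. Fraction(float) is exact, so any
# rounding in the subtraction would make an assertion fail.
for n in [0, 1, 2**52 - 1] + [random.getrandbits(52) for _ in range(10000)]:
    assert Fraction(via_bits(n)) == Fraction(n, 2**52)
```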

Python's random() implementation does something more along the lines of return self.getrandbits(53) / 2**53. The effects are similar, except that the distribution of outputs is now twice as fine: you get each number of the form n / 2^53 for 0 <= n < 2^53 with equal probability. It's unlikely that most applications would notice the difference between these two implementations in practice. If you care about speed, it's likely that this is faster, too, though as always you should profile to find out whether that's actually the case.
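A minimal sketch of that variant, assuming your class subclasses random.Random (the MyRandom wrapper here is only illustrative; a real implementation would also override getrandbits with its own bit source):

```python
import random

class MyRandom(random.Random):
    """Illustrative subclass; a real implementation would override
    getrandbits() with its own bit source."""
    def random(self):
        # 53 random bits scaled into [0.0, 1.0).
        return self.getrandbits(53) / 9007199254740992.0  # 2.0 ** 53

r = MyRandom(42)
samples = [r.random() for _ in range(100000)]
assert all(0.0 <= x < 1.0 for x in samples)
assert min(samples) < 0.01 and max(samples) > 0.99  # rough uniformity sanity check
```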

Neither of these is perfect: there are around 2^62 distinct IEEE 754 binary64 floats in the range [0.0, 1.0), and your implementation can only generate 2^52 distinct outputs, so most of those floats can never be generated by either of the above implementations. A better random() implementation could generate each floating-point number x in the range [0.0, 1.0] with a probability equal to the length of the subinterval of [0.0, 1.0) that rounds to x under some form of round-to-nearest. However, such an implementation would be significantly more complicated (though not particularly hard to implement), and few applications would benefit from the larger output set. As the Zen of Python says: "Practicality beats purity."


EDIT: To illustrate the last paragraph above, here's some code. The uniform function uses getrandbits to generate uniformly distributed floats on [0, 1] according to the above description.

"""                                                                                                                                                                                                                                         
High quality uniform random variable on [0, 1].                                                                                                                                                                                             

Simulates round(X) where X is a real random variable uniformly distributed on                                                                                                                                                               
the interval [0, 1] and round is the usual round-to-nearest rounding function                                                                                                                                                               
from real numbers to floating-point.                                                                                                                                                                                                        

"""
from __future__ import division
import random

precision = 53
emin = -1021

def random_significand():
    return (random.getrandbits(precision) + 1) // 2 / (2**precision)

def uniform():
    for i in xrange(1 - emin):
        if random.getrandbits(1):
            return (random_significand() + 0.5) / 2**i
    # Subnormal
    return random_significand() / 2**i