145
votes

In a small application written in C/C++, I am facing a problem with the rand function and maybe the seed :

I want to produce a sequence of random numbers that are of different orders, i.e. with different logarithm values (base 2). But it seems that all the numbers produced are of the same order, fluctuating just between 2^25 and 2^30.

Is it because rand() is seeded with Unix time which is by now a relatively big number? What am I forgetting ? I am seeding rand() only once at the beginning of the main().

9
FWIW so, is it C or C++? If by C/C++ you mean you can actually use C++, and the mention of C was just random, maybe this en.cppreference.com/w/cpp/numeric/random/binomial_distribution can help.R. Martinho Fernandes
Unfortunately you were betting on the wrong horse. Seed should not be your problem. Your problem was wrong expected distribution. Since unbiased programer would expect rand() to return uniformly distributed numbers (documentation with high Google ranking explicitly says so) I don't think this question is useful for future readers. That's why down vote but don't let it discourage you from using SO.Emperor Orionii
@doug65536 "...where no number is ever repeated" -- that's not random! I could fund my retirement at the craps table if my rand() dice never returned the same number twice until every possible number was returned.Chris Gregg
@GalacticCowboy Don't mistake periodicity with a repeat of individual numbers. From the Wikipedia article you cited: "a repeated result does not imply that the end of the period has been reached, since its internal state may be larger than its output." It would be very, very bad if a PRNG produced a value and then was guaranteed not to produce that value again until all values were returned.Chris Gregg
Doug65536, nobody is picking fights. They are just stating correctly that you are wrong. A PRNG could quite happily churn out the following if I wanted a RAND between 1 and 10: 2 4 7 2 8 1 5 9 7 3 That would be entirely valid, despite the multiple 2s and 7s. I think you are getting the PRNG confused with the shuffle facility on your iPhone.Relaxing In Cyprus

9 Answers

479
votes

There are only 3% of numbers between 1 and 230 which are NOT between 225 and 230. So, this sounds pretty normal :)

Because 225 / 230 = 2-5 = 1/32 = 0.03125 = 3.125%

271
votes

The lighter green is the region between 0 and 225; the darker green is the region between 225 and 230. The ticks are powers of 2.

distribution

42
votes

You need to be more precise: you want different base 2 logarithm values but what distribution do you want for this? The standard rand() functions generate a uniform distribution, you will need to transform this output using the quantile function associated with the distribution that you want.

If you tell us the distribution then we can tell you the quantile function you need.

18
votes

If you want different orders of magnitude, why not simply try pow(2, rand())? Or perhaps choose the order directly as rand(), as Harold suggested?

13
votes

@C4stor made a great point. But, for a more general case and easier to understand for human (base 10): for the range from 1 to 10^n, ~90% of the numbers are from 10^(n-1) to 10^n, therefore, ~99% of the numbers go from 10^(n-2) to 10^n. Keep adding as many decimals as you want.

Funny mathematics, if you keep doing this for n, you can see that from 1 to 10^n, 99.9999...% = 100% of the numbers are from 10^0 to 10^n with this method.

Now about code, if you want a random number with random orders of magnitude, from 0 to 10^n, you could do:

  1. Generate a small random number from 0 to n

  2. If you know the range that n has, generate a big random number of order 10^k where k > max{n}.

  3. Cut the longer random number to get the n digits of this big random number.

13
votes

The basic (and correct) answer was already given and accepted above: there are 10 numbers between 0 and 9, 90 numbers between 10 and 99, 900 between 100 and 999, etc.

For a computationally efficient way to get a distribution with approximately logarithmic distribution, you want to right-shift your random number by a random number:

s = rand() & 31; // a random number between 0 and 31 inclusive, assuming RAND_MAX = 2^32-1
r = rand() >> s; // right shift

It's not perfect, but it's much faster than computing pow(2, rand()*scalefactor). It will be "lumpy" in the sense that the distribution will be uniform for numbers within a factor 2 (uniform for 128 to 255, half the density for 256 to 1023, etc).

Here is a histogram of the frequency of the numbers 0 to 31 (in 1M samples):

enter image description here

5
votes

There are exactly equal number of numbers between 0 and 2^29 and 2^29 and 2^30.

Another way of looking at the problem: consider binary representation of the random number you generate, the probability that the highest bit is 1 equals 1/2, and, therefore, you get order 29 in half cases. What you want is to see a number that would be below 2^25, but that means 5 highest bits are all zero, which happens with a low probability of 1/32. Chances are that even if you run it for a long time you will never see the order below 15 at all (the probability is something like rolling 6 6 times in a row).

Now, the part of your question about the seed. No, the seed cannot possibly determine the range the numbers are generated from, it just determines the first, initial element. Think of rand() as a sequence of all possible numbers in the range (predetermined permutation). The seed determines where you start drawing numbers from the sequence. This is why if you want (pseudo) randomness, you use current time to initialize the sequence: you do not care that the position you start from is not uniformly distributed, all that matters is that you never start from the same position.

2
votes

use pow(2,rand()) it will give the answers in order of desired magnitude!!

2
votes

If you want to use random numbers from an online service you can use wget for that, you may want to see you can also use services like random.org for your random number generation , you can catch them using wget and then reading the numbers from the downloaded file

wget -q https://www.random.org/integers/?num=100&min=1&max=100&col=5&base=10&format=html&rnd=new -O new.txt

http://programmingconsole.blogspot.in/2013/11/a-better-and-different-way-to-generate.html