I would like to randomly pick numbers from the sequence 1:8 and save the picked numbers as observations of a new variable in a SAS data set. Each number from 1 to 8 should have the same chance of being picked (0.125). So once the new variable is generated and I run proc freq on it, I should get a frequency distribution close to 12.5% for each number in the sequence.

The R equivalent is something like this using the sample() function:

x <- sample(1:8, 1000, replace=TRUE,
            prob=c(.125, .125, .125, .125, .125, .125, .125, .125))

But how can I do that in SAS? Many thanks!

1 Answer

SAS has the rand function, which can draw from a number of distributions. The uniform distribution sounds like what you want. rand('Uniform') returns a value between 0 and 1, so you just multiply it by 8 and round up with ceil to get integers from 1 to 8.

data want;
 call streaminit(7);  *initialize random stream, pick whatever positive seed you want;
 do _n_=1 to 1000; *do 1000 times;
   x = ceil(rand('Uniform')*8);
   output;
 end;
run;

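Another method is the 'Table' distribution, which is more directly similar to the R function: you pass the probability of each outcome, and it returns the index of the outcome drawn (1 through 8 here).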

data want;
 call streaminit(7);
 do _n_ = 1 to 1000;
  x = rand('Table',.125,.125,.125,.125,.125,.125,.125,.125);
  output;
 end;
run;

proc freq data=want;
table x;
run;

However, in this case Uniform should do just as well.

Note that the Uniform method is very slightly biased at the top end: because rand('Uniform') cannot produce exactly 1, 8 will occur very slightly less often than 1 through 7. Writing x for rand('Uniform')*8, 1 is 0 < x <= 1, 5 is 4 < x <= 5, but 8 is 7 < x < 8. If you're producing 1000 numbers, that's not going to have a significant effect (we're looking at a range of about 2^63 numbers, so the missing 8 would come up extremely infrequently), but if you're producing very many numbers (on the order of 1e15 or thereabouts) it starts to be noticeable, and the Table method is superior. Alternatively, use 9 instead of 8 and discard the 9s.
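
For what it's worth, here is a rough sketch of the "use 9 and discard the 9s" idea. It reuses the want and x names from above; the inner redraw loop is just one way to do the discarding and is an illustration, not something I've benchmarked:

```sas
data want;
 call streaminit(7);  /* any positive seed */
 do _n_ = 1 to 1000;
   /* draw from 1..9 and redraw whenever a 9 comes up,    */
   /* so 1..8 each keep a complete interval of the same width */
   do until (x <= 8);
     x = ceil(rand('Uniform')*9);
   end;
   output;
 end;
run;
```

About one draw in nine gets thrown away, so this does slightly more work than the plain Uniform version, which only matters for very large samples.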