In SAS, I have written a program that randomly takes 50 observations from a data set and calculates the mean of one variable over those observations.

data subset (drop=i samplesize);
  samplesize=50;
  /* draw 50 observations at random, with replacement */
  do i=1 to samplesize;
    /* totobs is filled in at compile time by NOBS= below */
    obsnum=ceil(ranuni(0)*totobs);
    set sashelp.baseball point=obsnum nobs=totobs;
    output;
  end;
  stop;
run;


proc sql;
select mean(nHome) from subset;
quit;

I would like to edit the code so that it produces 10 independent random samples instead of one. (I am aware of the REPS= option in PROC SURVEYSELECT, but I am not supposed to use it here.) Thanks.

1 Answer

The k/n algorithm selects a fixed number of observations, k, out of n without replacement, and every observation has the same probability, k/n, of ending up in the sample.

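%let SEED = 1234;

data mySurveySelection;

  /* k = observations still to be selected (the sample size) */
  retain k 10; drop k;
  length select_n_ 8;

  /* n starts as the total row count and is decremented below */
  set sashelp.baseball nobs=n;

  /* select this row with probability (still needed)/(still remaining) */
  if (ranuni(&SEED) <= k/n) then do;
    k = k - 1;
    select_n_ = _n_;  /* record the original row number */
    output;
  end;
  n = n - 1;

  if n = 0 then stop;
run;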

You didn't ask for a proof that every observation is indeed equally likely to be selected, so I won't show that.
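
To get the 10 independent samples of 50 that the question asks about, one option is to wrap the same k/n test in an outer loop and read the chosen rows with POINT=. Treat this as a sketch rather than a finished solution: the names samples, replicate, n_left and obsnum, and the seed 1234, are my own choices for illustration and do not come from the question.

```sas
data samples (drop=k n_left);
  do replicate = 1 to 10;          /* 10 samples */
    k      = 50;                   /* observations still to select */
    n_left = totobs;               /* observations still to examine */
    do obsnum = 1 to totobs;
      /* k/n test: (still needed) / (still remaining) */
      if ranuni(1234) <= k / n_left then do;
        set sashelp.baseball point=obsnum nobs=totobs;
        output;
        k = k - 1;
      end;
      n_left = n_left - 1;
    end;
  end;
  stop;                            /* required when using POINT= */
run;

/* one mean per sample */
proc sql;
  select replicate, mean(nHome) as mean_nHome
  from samples
  group by replicate;
quit;
```

Each pass of the outer loop resets k and n_left, while the random number stream simply continues, so the 10 samples are drawn independently of one another. Within a sample no row is picked twice, but the same row can show up in more than one sample.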

PROC SURVEYSELECT is what you would typically use in a production-level study or code base.