
I'm very new to both SAS and statistical analysis in general. I have a degree in computer science and I'm taking an online course in statistics and am confused on how to achieve what I want in SAS. I have searched online to no avail but am probably not even using the right terminology since I don't really know SAS or stats very well.

Basically, I have a few variables in a dataset and I want to display them conditionally in frequency tables.

For example, let's say I have the variables Gender, Age and Alcohol_use. What I want to do is have a frequency table or tables that basically relate gender and age to alcohol use. So, I want to break it down by gender and age at the same time, if that makes sense. One example would be:

Male, 21-25 -> Moderate Use
Female, 21-25 -> Low Use
Male, 26-30 -> Heavy Use
etc...

So, I guess I want to have frequency tables for the third variable on certain conditions of the first two variables, if that makes sense.

Normally, when displaying frequency tables, I just write PROC FREQ; TABLES Gender Age Alcohol_use;

Would I be changing anything there, since it is the frequency table that is affected? Or do I need to add some conditions in the data section of the program?

Any help would be great. Please let me know if you need any clarification on my question. Thanks!

Is Alcohol_use a measure? If so, what values define Heavy, Moderate, and Low usage? Or are those the values of the variable? - BellevueBob
Sorry for the confusion. I was just giving an example and meant that Alcohol_use is a variable (what our professor calls it) and Heavy, Moderate and Low use are the values that variable can take, just as Gender is a variable with the values Male or Female. - kyro1021
I'm guessing you're here from Coursera/Passion Driven Statistics? In either case, welcome! - Joe
Haha, yes, I am! Thank you! - kyro1021

1 Answer


You are on the right track with PROC FREQ. That procedure will produce a frequency table report and even an output data set with results. First, here is some made-up data:

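/* Made-up data: 2 genders x 10 repeats x 4 ages = 80 observations */
data have;
   do gender = 1,2;                 /* 1 = male, 2 = female */
      do tmp=1 to 10;               /* repeat counter, not used in the analysis */
         do age=10,21,27,32;
            /* random usage score between 0 and 100 */
            alcohol_use = round(ranuni(12345)*100);
            id + 1;                 /* running observation id */
            output;
         end;
      end;
   end;
run;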

I've kept the sample data separate from the rest of the answer to make the illustration clearer. The form of your variables wasn't clear from your question, so let's assume they are stored as numbers: a numeric age, a numeric usage score, and a numeric gender code. In that case, we can use PROC FORMAT to define grouping formats for the variables:

proc format;
   value agefmt
      0-20   = '20 and below'
     21-25   = '21-25'
     26-30   = '26-30'
     31-high = '31 and above';
   value usgfmt
      0-30    = 'Low'      /* range includes 0, which the rounded score can take */
     30<-80   = 'Moderate'
     80<-high = 'Heavy';
   value genfmt
      1  = 'Male'
      2  = 'Female';
run;

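Now it's just a matter of running PROC FREQ. The asterisks in the TABLE statement request a cross-tabulation of the three variables, the LIST option prints the result as a flat list rather than as nested tables, and the OUT= option gives the name of a new SAS data set to create, which will contain the summarized results. Because of the FORMAT statement, PROC FREQ groups the observations by their formatted values: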

proc freq data=have;
   table gender * age * alcohol_use / list out=want;
   format gender genfmt. age agefmt. alcohol_use usgfmt.;
run;
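
If you would rather have a separate frequency table of alcohol_use for each gender and age group, one possible alternative is BY-group processing. This is only a sketch, assuming the same data set and formats as above; the data set name have_sorted is made up for illustration:

```sas
/* BY-group processing requires the data to be sorted by the BY variables */
proc sort data=have out=have_sorted;   /* have_sorted is a hypothetical name */
   by gender age;
run;

/* One table of alcohol_use per gender/age group */
proc freq data=have_sorted;
   by gender age;
   table alcohol_use;
   format gender genfmt. age agefmt. alcohol_use usgfmt.;
run;
```

The LIST version above puts everything in one table, which is usually easier to scan; the BY version may be more convenient when each subgroup needs its own table with its own percentages.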

If your original data already contains hard-coded values like "Male" and "Heavy", you need neither the PROC FORMAT step nor the FORMAT statement in PROC FREQ.
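
As a rough illustration of that case, here is a sketch using a small made-up data set whose variables are already character values. The data set name have_char, the variable age_group, and the four sample rows are all assumptions for the example, not something from your data:

```sas
/* Hypothetical data: categories are stored directly as text */
data have_char;
   length gender $6 age_group $5 alcohol_use $8;
   input gender $ age_group $ alcohol_use $;
   datalines;
Male 21-25 Moderate
Female 21-25 Low
Male 26-30 Heavy
Male 21-25 Moderate
;
run;

/* No formats needed: the stored values are already the categories */
proc freq data=have_char;
   table gender * age_group * alcohol_use / list;
run;
```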