0
votes

I am wondering the best way to transpose data in SAS when I have multiple occurances of my id variable. I know I can use the let option in the proc transpose statement to do this, but I do not want to get rid of any data, as I intend to compute averages.

Here is an example of my data and my code:

data grades;
input student testnum grade;
cards;
1   1   30
1   1   25
1   2   45
1   3   67
2   1   22
2   2   63
2   2   12
2   2   77
3   1   22
3   1   17
3   2   14
3   4   17
;
run;

proc sort data=grades;
    by student testnum;
run;

proc transpose data=grades out=trgrades;
    by student;
    id testnum;
    var grade;
run;

Here is how I would like my resulting dataset to look:

student testnum1 testnum2 testnum3 testnum4   avg12     avg34
1         30        45      67       .         33.33     67    
1         25        .        .       .         33.33     67
2         22        63       .       .         43.5      .
2         .         12       .       .         43.5      .
2         .         77       .       .         43.5      .
3         22        14       .      17         53        17
3         17        .        .       .         53        17

I want to use this new dataset (not sure how yet) to create the new columns that are the average score of all testnum1's and testnum2's for a student (avg12) and the average of all testenum3's and testnum4's (avg34) for a student.

There may be a much more efficient way to do this but I am stumped. Any advice is appreciated.

1

1 Answers

3
votes

If all you really need is the average of all test 1's and 2's, and 3's and 4's for each student, then you don't need to transpose at all. All you need is a simple data step:

data grouped;
   set grades;
   if testnum In (1,2) then group=1;
   else if testnum in (3,4) then group=2;
run;

Then a basic proc means:

proc means data=grouped;
   by student group;
   var grade;
   output out=averages mean=groupaverage;
run;

If you need the averages in a single observation, you can easily transpose the averages dataset.

proc transpose data=grades out=trgrades;
    by student;
    id group;
    var grade;
run;

Update:

As mentioned by @Keith, using a format to group the tests is an excellent choice as well. Skip the data step and create the format like so:

proc format;
   value TestGroup
      1,2 = 'Tests 1 and 2'
      3,4 = 'Tests 3 and 4'
   ;
run;

Then the proc means becomes:

proc means data=grouped;
   by student testnum;
   var grade;
   format testnum TestGroup.;
   output out=averages mean=groupaverage;
run;

End Update


If, for some reason, you really need to have all the test scores in one observation then I would recommend using a data step to make them uniquely identifiable. Use by, testnum.first, retain, and a simple counter to assign each score a retake number. Now your transpose uses retake and testnum as id variables. You should be able to figure it out from there.

Really hoping right now that I didn't just do your SAS homework assignment for you.