1
votes

Sorry I'm new to a lot of the features of SAS - I've only been using for a couple months, mostly for survey data analysis but now I'm working with a dataset which has individual level data for a cross-over study. It's in the form: ID treatment period measure1 measure2 ....

What I want to do is be able to group these individuals by their treatment group and then output a variable with a group average for measure 1 and measure 2 and another variable with the count of observations in each group.

ie

ID trt per  m1  m2
1   1   1  101  75
1   2   2  135  89
2   1   1  103  77
2   2   2  140  87
3   2   1  134  79
3   1   2  140  80
4   2   1  156  98
4   1   2  104  78

what I want is the data in the form:

group a = where trt=1 & per=1
group b = where trt=2 & per=2
group c = where trt=2 & per=1
group d = where trt=1 & per=2

trtgrp avg_m1 avg_m2 n
  A      102   76    2
  B      ...    ...  ...
  C
  D

Thank you for the help.

2

2 Answers

4
votes

/Creating Sample dataset/

data test;
infile datalines dlm=" ";
input ID : 8.
      trt : 8.
      per : 8.
      m1 : 8.
      m2 : 8.;
      put ID=;
datalines;
1 1 1 101 75
1 2 2 135 89
2 1 1 103 77
2 2 2 140 87
3 2 1 134 79
3 1 2 140 80
4 2 1 156 98
4 1 2 104 78
;
run;

/Using proc summary to summarize trt and per/

  1. Variables(dimensions) on which you want to summarize would go into class
  2. Variables(measures) for which you want to have average would go into var
  3. Since you want to have produce average so you will have to write mean as the desired statistics.

Read more about proc summary here

and here


proc summary data=test nway;
class trt per;
var m1 m2;
output out=final(drop= _type_)
mean=;
run;
0
votes

The alternative method uses PROC SQL, the advantage being that it makes use of plain-English syntax, so the concept of a group in your question is maintained in the syntax:

PROC SQL;
  CREATE TABLE final AS
  SELECT 
  trt,
  per,
  avg(m1) AS avg_m1,
  avg(m2) AS avg_m2,
  count(*) AS n
  FROM
  test
  GROUP BY trt, per;
QUIT;

You can even add your own group headings by applying conditional CASE logic as you did in your question:

PROC SQL;
  CREATE TABLE final AS
  SELECT 
  CASE 
    WHEN trt=1 AND per=1 THEN 'A'
    WHEN trt=2 AND per=2 THEN 'B'
    WHEN trt=2 AND per=1 THEN 'C'
    WHEN trt=1 AND per=2 THEN 'D'
  END AS group
  avg(m1) AS avg_m1,
  avg(m2) AS avg_m2,
  count(*) AS n
  FROM
  test
  GROUP BY group;
QUIT;

COUNT(*) simply counts the number of rows found within the group. The AVG function calculates the average for the given column.

In each example, you can replace the explicitly named columns in the GROUP BY clause with a number representing column position in the SELECT clause.

GROUP BY 1,2

However, take care with this method, as adding columns to the SELECT clause later can cause problems.