0
votes

I have a sas dataset concerning grades. Here is a subset follows:

 Name:     Grade:      Subject:   
Bob         8         English
Bob         6         Maths
Bill        7         English
Joe         9         Maths

Is there a way I can find the average grade per person? In another language I understand I would group by Name, and end up with

 Name:     Grade:       
Bob         14                      
Bill        7         
Joe         9     

and then find the average ie sum(grade)/len(grade) to get the average grade per person i.e. (30/3 = 10).

Please note some of my observations appear many times like "Bob" in this example which is why a "group by" approach is necessary.

Is there a way to do this in sas?

4
Please ensure to include something you've tried to solve the problem so far.Reeza

4 Answers

2
votes

something like this in datastep

/*sorting*/
proc sort data=have out=have1;
by name;
run;


  data want(drop=new_grade);
  do until(last.name);
  set have1(rename=(grade=new_grade));
  by name;
 grade = sum(new_grade, grade);
end;
run;

or in proc sql

proc sql;
create table want as 
select name, sum(grade) as grade
from have
group by name;
1
votes

I show you example on standart SAS dataset. You must previously sort by variable(s) you want group

proc sort data=sashelp.class out=class(keep=Sex Name Weight);
by sex;
run;

Next add some retain vars (keep values in "next" rows) and condition with last or first expressions

data res;
set class;
by sex;
retain sum 0 n 0;
sum+Weight;
n+1;

if last.sex then
    do;
        avg=sum/n;
        n=0;
        Sum=0;
    end;
run;

result

+---------+-----+--------+-------+---+--------------+
|  Name   | Sex | Weight |  sum  | n |     avg      |
+---------+-----+--------+-------+---+--------------+
| Alice   | F   |     84 |    84 | 1 |            . |
| Barbara | F   |     98 |   182 | 2 |            . |
| Carol   | F   |  102.5 | 284.5 | 3 |            . |
| Jane    | F   |   84.5 |   369 | 4 |            . |
| Janet   | F   |  112.5 | 481.5 | 5 |            . |
| Joyce   | F   |   50.5 |   532 | 6 |            . |
| Judy    | F   |     90 |   622 | 7 |            . |
| Louise  | F   |     77 |   699 | 8 |            . |
| Mary    | F   |    112 |     0 | 0 | 90.111111111 |
| Alfred  | M   |  112.5 | 112.5 | 1 |            . |
| Henry   | M   |  102.5 |   215 | 2 |            . |
| James   | M   |     83 |   298 | 3 |            . |
| Jeffrey | M   |     84 |   382 | 4 |            . |
| John    | M   |   99.5 | 481.5 | 5 |            . |
| Philip  | M   |    150 | 631.5 | 6 |            . |
| Robert  | M   |    128 | 759.5 | 7 |            . |
| Ronald  | M   |    133 | 892.5 | 8 |            . |
| Thomas  | M   |     85 | 977.5 | 9 |            . |
| William | M   |    112 |     0 | 0 |       108.95 |
+---------+-----+--------+-------+---+--------------+
1
votes

You can do this directly through SQL in SAS.

proc sql noprint;
    create table Avg_Grades as
        select Name, mean(grade) as Avg_Grade
        from have
        group by Name
        ;
quit;
1
votes

Your best approach here is to use a summary procedure, especially if you want statistics other than just the mean. Note that if you use a CLASS statement you do not have to pre-sort your data.

  1. On the PROC MEANS statement there are several options:
    • NWAY - only includes the highest level, otherwise SAS will add an overall total as well
    • N/MEAN/NMISS/MEDIAN - statistics to be prepared
    • STACKODS changes the output format, this is usually the form most people want.
  2. CLASS -> grouping variable
  3. VAR -> variable to analyze
  4. ODS OUTPUT -> sends results to a table called WANT.

    proc means data=sashelp.class NWAY N MEAN NMISS MEDIAN STACKODS;
    CLASS sex; *this is your grouping variable(s), ie name;
    VAR height; *this is the variable(s) you want to summarize;
    ods output summary = want;
    run;