
I'm having an issue with how proc summary behaves when variables in the class statement have missing values. In the example below, test_out contains every combination of the class variables (all _TYPE_ values). test_missing_out does not: the observation where var2 is missing is left out of every sum of var3, even though var1 is not missing for that observation:

data test;
    infile datalines dsd delimiter=' ';
    input var1 var2 $ var3;
    datalines;
1 data 200
2 data2 103
;
run;

proc summary
    data=test;
    class var1 var2;
    var var3;
    output out=test_out sum=sum;
run;


data test_missing;
    infile datalines dsd delimiter=' ';
    input var1 var2 $ var3;
    datalines;
1 data 200
2  103
;
run;

proc summary
    data=test_missing;
    class var1 var2;
    var var3;
    output out=test_missing_out sum=sum; 
run;

1 Answer


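proc summary shares most of its syntax with proc means. By default, both procedures drop any observation that has a missing value for any class variable, so the row with var1=2 is excluded from every output row, including the overall total. Add the MISSING option to the proc summary statement if you want missing values to be treated as a valid grouping level: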

proc summary
    data=test_missing
    missing; /* treat missing class values as a valid level */
    class var1 var2;
    var var3;
    output out=test_missing_out sum=sum;
run;
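
If you only want missing values kept for some class variables, the CLASS statement also accepts a MISSING option, and a step can have more than one CLASS statement. The following is a minimal sketch using the test_missing data set from the question. The output name test_missing_out2 is just a placeholder chosen for illustration. Here a missing var2 counts as a level, while an observation with a missing var1 would still be dropped:

```sas
/* Sketch: apply MISSING to var2 only by giving it its own CLASS statement. */
/* test_missing_out2 is a hypothetical output name used for this example.   */
proc summary
    data=test_missing;
    class var1;              /* missing var1 values are still excluded */
    class var2 / missing;    /* missing var2 values form their own level */
    var var3;
    output out=test_missing_out2 sum=sum;
run;
```

The option on the proc summary statement is the simpler choice when every class variable should keep its missing level. The per-statement form is only worth the extra line when the variables need different treatment.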