3
votes

I have a data set with many variables - many of which are character valued. I have the following code to count the number of missing values for each variable:

proc format;
 value $missfmt ' '='Missing' other='Not Missing';
 value  missfmt  . ='Missing' other='Not Missing';
run;


proc freq data=dataname; 
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;

However, this results in a huge output (300 page pdf if I print to pdf) with 90% of the variables having no missing values. How do I tell PROC FREQ to only display the tables which have missing values?

2

2 Answers

4
votes

You can identify which variables have a missing value from the NLEVELS option in PROC FREQ. So my process would be to create a dataset that just held the variables with missing values, then store them in a macro variable so the following PROC FREQ can be run against them only. Here is the code to do that.

/* set up dummy dataset */
data have;
set sashelp.class;
if _n_ in (10,13) then call missing(age,sex);
run;

/* create dataset that holds variables with missing values */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0));
ods noresults;
proc freq data=have nlevels;
run;
ods results;

/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;

proc format;
 value $missfmt ' '='Missing' other='Not Missing';
 value  missfmt  . ='Missing' other='Not Missing';
run;

proc freq data=have (keep=&missvar.); 
format _CHAR_ $missfmt.; /* apply format for the duration of this PROC */
tables _CHAR_ / missing missprint nocum nopercent;
format _NUMERIC_ missfmt.;
tables _NUMERIC_ / missing missprint nocum nopercent;
run;
1
votes

This one removes all blank columns:

%macro removeblanks(dataset,output);
/* create dataset that holds variables with missing values */
ods select nlevels;
ods output nlevels=miss_vars (where=(nmisslevels>0 and nnonmisslevels=0));
ods noresults;
proc freq data=&dataset. nlevels;
run;

/* store names in a macro variable */
proc sql noprint;
select tablevar into :missvar separated by ' '
from miss_vars;
quit;

data &output.;
set &dataset.(drop=&missvar.);
run;

%mend removeblanks;`