0
votes

I have a very big SAS Dataset with over 280 variables in it and I need retrieve all the complete NULL columns based on a Variable value. For example I have a Variable called Reported(with only values Yes & No) in this dataset and I want to find out based on value No, all the complete Null Columns in this dataset.

Is there any quick way to find this out with out writing all the columns names for complete NULL values?

So for example if I have 4 Variables in the table,

enter image description here

So based on the above table I would like to see the output like this where Var4='No' and only return the columns with all the missing values

enter image description here

This would help me to identify variables which are not being populated at all where the Var4 value is 'No'

2
Your question is confusing. Do you want to check for NULL values (or in SAS terminology MISSING values) or values of 'No'? - Tom
yes I want all the variables where with MISSING values only based on Reported variable value 'No' - LondonUK
Still not clear. Are you saying the NAME of the variable to check is NO? If so why mention that there are 280 variables? Please post a small example (with just 2 or 3 variable, not 280 variables) and expected result for the example. - Tom
I have updated the original question with an example. Thanks - LondonUK
So now it looks like you want find OBSERVATIONS and not VARIABLES. The observations you want are those with VAR4='No' and missing values for all of the other columns. - Tom

2 Answers

3
votes

Note the WHERE statement in PROC FREQ.

proc format;
   value $_xmiss_(default=1 min=1 max=1) ' '  =' ' other='1';
   value  _xmiss_(default=1 min=1 max=1) ._-.Z=' ' other='1';
   quit;

%let data=sashelp.heart;


proc freq data=&data nlevels;
   where status eq: 'A';
   ods select nlevels;
   ods output nlevels=nlevels;
   format _character_ $_xmiss_. _numeric_ _xmiss_.;
   run;

data nlevels;
   length TABLEVAR $32 TABLEVARLABEL $128 NLEVELS NMISSLEVELS NNONMISSLEVELS 8;
   retain NLEVELS NMISSLEVELS NNONMISSLEVELS 0;
   set nlevels;
   run;
2
votes

There are 2 parts to the question I think. First is to subset records where Reported = "N". Then among those records, report columns that have all missing values. If this is correct then you could do something as follows (I am assuming that the columns with missing values are all numeric. If not, this approach will need a slight modification):

/* Thanks to REEZA for pointing out this way of getting the freqs. This eliminates some constraints and is more efficient */
proc freq data=have nlevels ;
  where var1 = "N" ;
  ods output nlevels = freqs;
  table _all_;
run;

proc sql noprint;
  select TableVar into :cols separated by " " from freqs where NNonMissLevels = 0 ;
  quit;

%put &cols;


data want;
  set have (keep = &cols var1);
  where var1 = "N" ;
run;