3
votes

I have a dataset with 300+ variables and I want to perform stepwise selection in PROC LOGISTIC (I understand stepwise selection is a bad idea here but it's not up to me) on all these variables - some of which are numeric and some of which are categorical.

Without typing the name of each of the 300+ variables, how do I write the model statement so that the model is all variables in my data set except for my response variable? How do I write the class statement so that it knows to treat all the categorical variables as categorical?

1
It may be helpful if you could mock up an example with 5 or 10 variables and 10 or 20 observations that looks sort of like your data and, if you are able to, the PROC LOGISTIC call (hand coded) that most closely approximates what you want to run. - Joe
In your MODEL statement, write model dependent_variable = var1 -- var300; and then list the class variables in a CLASS statement above it. - gobrewers14

1 Answer

1
votes

You can quickly list all the variable names in your dataset, ready to copy and paste, with this:

proc contents data = X short;
run;

This prints a list of variable names that you can copy and paste into the MODEL statement of your PROC LOGISTIC step (leave out the response variable when you paste).

Assuming your class variables are all character variables, you can list just those with the following:

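/* Write the variable metadata for X to a dataset instead of printing it */
proc contents data = X out=test noprint;
run;

/* Keep only character variables (TYPE=2 in the PROC CONTENTS output) */
data test; set test;
if TYPE=2;
run;

/* Turn each character variable name into a column of test2 */
proc transpose data=test out=test2;
var name;
id name;
run;

/* Print the column names as a list for the CLASS statement.
   Automatic columns such as _NAME_ also appear; leave them out when pasting. */
proc contents data = test2 short;
run;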
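
If you would rather avoid copying and pasting altogether, the same idea can be sketched with macro variables built from DICTIONARY.COLUMNS. This is only a sketch under assumptions that are not in the question: the dataset is WORK.X, the response variable is called y (a hypothetical name), and every categorical predictor is stored as a character variable. Numeric-coded categorical variables would not be picked up and would still need to be added to the CLASS statement by hand.

```sas
proc sql noprint;
    /* Character variables other than the response -> CLASS list.
       LIBNAME and MEMNAME are stored in upper case in DICTIONARY.COLUMNS. */
    select name into :classvars separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'X'
      and type = 'char' and upcase(name) ne 'Y';

    /* All variables other than the response -> MODEL list */
    select name into :allvars separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'X'
      and upcase(name) ne 'Y';
quit;

/* y is a hypothetical response name; replace it with your own */
proc logistic data = X;
    class &classvars;
    model y = &allvars / selection = stepwise;
run;
```

Note that if the dataset has no character variables, the first query returns no rows and &classvars is never created, so the CLASS statement would have to be removed in that case.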