I am trying to run a stepwise logistic regression in R with a dichotomous DV. I have looked into the step function, which uses AIC to select a model and essentially requires a null model and a full model. Here's the syntax I've been trying (I have a lot of IVs, and the N is 100,000+):

Full = glm(WouldRecommend_favorability ~ i1 + i2 + i3 + i4 + i5 + i6.....i83 + b14 + 
                                         Shift_recoded, data = ee2015, family = "binomial")
Nothing = glm(WouldRecommend_favorability ~ 1, data = ee2015, family = "binomial")
Full_Nothing_Step = step(Nothing, scope = Full,Nothing, scale = 0, direction = c('both'), 
                         trace = 1, keep = NULL, steps = 1000, k = 2)

One thing I am not sure about is the order in which "Nothing" and "Full" should be entered in the step call. Whichever way I try, when I print a summary of "Full_Nothing_Step", it only gives me a summary of either "Nothing" or "Full":

Call:
glm(formula = WouldRecommend_favorability ~ 1, family = "binomial", 
    data = ee2015)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.8263   0.1929   0.1929   0.1929   0.1929  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.97538    0.01978     201   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 25950  on 141265  degrees of freedom
Residual deviance: 25950  on 141265  degrees of freedom
AIC: 25952

Number of Fisher Scoring iterations: 6 

I am pretty familiar with logistic regression in general but am new to R.

1 Answer

As the documentation states, scope can be given either as a single formula or as a list with upper and lower components (both formulas) that bound the search.

In the example below, my initial model is lm1, and I run the stepwise procedure in both directions. The upper bound of the search is the model with all main effects plus all two-way interactions, and the lower bound is the model with all main effects. You can easily adapt this to a glm model and add whatever additional arguments you need.

Be sure to read through the help page (?step), though.

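# Starting model: all main effects
lm1 <- lm(Fertility ~ ., data = swiss)

# Search between the main-effects model (lower) and the model
# with all two-way interactions (upper), in both directions
slm1 <- step(lm1, scope = list(upper = as.formula(Fertility ~ .^2),
                               lower = as.formula(Fertility ~ .)),
             direction = "both")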
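
Adapted to a logistic regression like the one in the question, the same idea might look like the sketch below. It is only an illustration: the built-in infert data set stands in for ee2015, and the names full_model, null_model and selected are made up for this example. The point it shows is that scope gets a list of two formulas rather than the fitted model objects. In the original call, scope = Full passes a glm object, which is itself a list without upper or lower components, so step most likely finds nothing to add to the intercept-only model and returns it unchanged.

```r
# Stand-in data (assumption): the built-in infert data set replaces ee2015.
full_model <- glm(case ~ age + parity + education + spontaneous + induced,
                  data = infert, family = binomial)
null_model <- glm(case ~ 1, data = infert, family = binomial)

# Start from the intercept-only model and search in both directions.
# scope is a list of formulas, not fitted models:
#   lower = smallest model allowed, upper = largest model allowed.
selected <- step(null_model,
                 scope = list(lower = formula(null_model),
                              upper = formula(full_model)),
                 direction = "both",
                 trace = 0)  # set trace = 1 to print each step

summary(selected)
```

Starting from null_model makes the search mostly forward, which tends to be cheaper than starting from the full model when there are many IVs and a large N.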