I just can't figure out what is wrong with my code. I fitted a logistic regression model with this dataset:

  outcome predictor
1       0        -3 
2       0        -4
3       1         0
4       1        -1
5       1         0
6       0        -3

I fitted this model:

model <- glm(data$outcome~data$predictor, family = "binomial")

               Estimate Std. Error    z value     Pr(>|z|)
(Intercept) -0.01437719 0.07516923 -0.1912644 8.483185e-01
pvalue.us    0.19469804 0.03110934  6.2585081 3.886777e-10

Then I want to make predictions using this vector:

test
[1] -2 -5  0 -3  2 -3

predict(model, newdata = test)

And I get this error:

Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one

What is wrong?

1 Answer

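Two things are going wrong. First, newdata must be a data frame whose column names match the predictors in the formula, not a bare numeric vector; passing a vector is what triggers the "numeric 'envir' arg not of length one" error. Second, if you want to use functions like predict(), you shouldn't use $-indexing in your formula; use the data= argument instead, e.g.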

 model <- glm(outcome~predictor, data=your_data, family = "binomial")

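If you use $ in your formula, predict() will not actually use the variables found in the new data frame: the term data$predictor is still evaluated against the original data object, so you simply get the predictions for the original rows back.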
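
A minimal sketch of why this happens, assuming the six rows from the question are stored in a data frame called data (the names bad and good are only illustrative). Comparing the term labels stored in the two fitted models shows which expression predict() will later try to evaluate:

```r
# The asker's six rows, rebuilt as a data frame named `data` (assumed name).
data <- data.frame(
  outcome   = c(0, 0, 1, 1, 1, 0),
  predictor = c(-3, -4, 0, -1, 0, -3)
)

# These toy data are perfectly separated, so glm() is expected to warn about
# fitted probabilities of 0 or 1; the warnings are silenced here only to
# keep the output readable.
bad  <- suppressWarnings(glm(data$outcome ~ data$predictor, family = "binomial"))
good <- suppressWarnings(glm(outcome ~ predictor, data = data, family = "binomial"))

# predict() evaluates each term's expression with newdata as the environment.
bad_label  <- attr(terms(bad),  "term.labels")
good_label <- attr(terms(good), "term.labels")
print(bad_label)   # the whole expression data$predictor, not a column name
print(good_label)  # a plain column name that newdata can supply
```

Because newdata has no column called data, the expression data$predictor in the first model falls through to the original object in the calling environment.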

Using the example given:

 model <- glm(data$outcome~data$predictor, family = "binomial")
 predict(model,newdata=data.frame(predictor=1:6))
 ##         1         2         3         4         5         6 
 ## -23.48969 -46.57791  45.77497  22.68675  45.77497 -23.48969
 predict(model,newdata=data.frame(predictor=rep(0,6)))
 ##         1         2         3         4         5         6 
 ## -23.48969 -46.57791  45.77497  22.68675  45.77497 -23.48969 

The results are identical even though newdata differs (!). R only warns you when newdata has a different number of rows from your original data set.
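
For comparison, here is a sketch of the corrected workflow applied to the question's data and test vector. It assumes the data frame is called your_data, as in the snippet above; new_rows, link_pred and prob_pred are illustrative names. The test vector is wrapped in a data frame whose column name matches the predictor in the formula:

```r
# The asker's six rows (data frame name `your_data` is an assumption).
your_data <- data.frame(
  outcome   = c(0, 0, 1, 1, 1, 0),
  predictor = c(-3, -4, 0, -1, 0, -3)
)
test <- c(-2, -5, 0, -3, 2, -3)

# Fit with data= so the formula refers to plain column names.
# The toy data are perfectly separated, so glm() is expected to warn;
# the warnings are silenced here only for readability.
model <- suppressWarnings(
  glm(outcome ~ predictor, data = your_data, family = "binomial")
)

# newdata must be a data frame with a column named like the predictor.
new_rows <- data.frame(predictor = test)

# Default scale is the linear predictor (log-odds) ...
link_pred <- predict(model, newdata = new_rows)
# ... while type = "response" returns predicted probabilities.
prob_pred <- predict(model, newdata = new_rows, type = "response")

print(link_pred)
print(prob_pred)
```

With this version the predictions follow the values in new_rows, so rows with the same predictor value (the fourth and sixth here) get the same prediction.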