1
votes

I have the model

am.glm = glm(formula=am ~ hp + I(mpg^2), data=mtcars, family=binomial)

which gives

> summary(am.glm)

Call:
glm(formula = am ~ hp + I(mpg^2), family = binomial, data = mtcars)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.5871  -0.5376  -0.1128   0.1101   1.6937  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)  
(Intercept) -18.71428    8.45330  -2.214   0.0268 *
hp            0.04689    0.02367   1.981   0.0476 *
I(mpg^2)      0.02811    0.01273   2.207   0.0273 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.230  on 31  degrees of freedom
Residual deviance: 20.385  on 29  degrees of freedom
AIC: 26.385

Number of Fisher Scoring iterations: 7

Given a value of hp I would like to find the values of mpg that would lead to a 50% probability of am.

I haven't managed to find anything that can be used to output such predictions. I have managed to code something using

#Coefficients
glm.intercept<-as.numeric(coef(am.glm)[1])
glm.hp.beta<-as.numeric(coef(am.glm)[2])
glm.mpg.sq.beta<-as.numeric(coef(am.glm)[3])
glm.hp.mpg.beta<-as.numeric(coef(am.glm)[4])

#Constants
prob=0.9
c<-log(prob/(1-prob))
hp=120

polyroot(c((glm.hp.beta*hp)+glm.intercept-c, glm.hp.mpg.beta*hp,glm.mpg.sq.beta))

Is there a more elegant solution? Perhaps a predict function equivalent?

1
Did my answer do what you were looking for?JanLauGe

1 Answers

1
votes

Interesting problem!

How about the solution below? Basically, create newdata for which your target variable is sampling the range of observed values. Predict for the vector of these values, and find the minimum value that meets your criteria

# Your desired threshold
prob = 0.5

# Create a sampling vector
df_new <- data.frame(
  hp = rep(120, times = 100),
  mpg = seq(from = range(mtcars$mpg)[1], 
            to = range(mtcars$mpg)[2], 
            length.out = 100))

# Predict on sampling vector
df_new$am <- predict(am.glm, newdata = df_new)
# Find lowest value above threshold
df_new[min(which(df_new$am > prob)), 'mpg']