2
votes

I am trying to determine confidence intervals for predicted probabilities from a binomial logistic regression in R. The model is estimated using lrm (from the package rms) to allow for clustering standard errors on survey respondents (each respondent appears up to 3 times in the data):

library(rms)
model1<-lrm(outcome~var1+var2+var3,data=mydata,x=T,y=T,se.fit=T)
model.rob<-robcov(model1,cluster=respondent.id)

I am able to estimate a predicted probability for the outcome using predict.lrm:

predicted.prob<-predict(model.rob,newdata=data.frame(var1=1,var2=.33,var3=.5),
type="fitted")

What I want to determine is a 95% confidence interval for this predicted probability. I have tried specifying se.fit=T, but this not permissible in predict.lrm when type=fitted.

I have spent the last few hours scouring the Internet for how to do this with lrm to no avail (obviously). Can anyone point me toward a method for determining this confidence interval? Alternatively, if it is impossible or difficult with lrm models, is there another way to estimate a logit with clustered standard errors for which confidence intervals would be more easily obtainable?

1
Close as more appropriate on another SE site. Without a data example, this is only a statistical question. Furthermore, Frank is more likely to see it over at CrossValidated.com than he is here, anyway.IRTFM
I'm unclear on which site is more appropriate for this type of question. The question is about programming but definitely touches on stat.Frank Harrell
@FrankHarrell I was considering offering the exp(fit +/- 1.96*se)/(1+ exp(fit +/- 1.96*se) ) strategy but after looking at ?predict.lrm figured there was a reason you didn't provide that. I thought perhaps there was a problem with not taking into account the covariances. As you can see, I hadn't gone down through the examples. And I falsely imagined you might not see it as soon if it sat here.IRTFM

1 Answers

4
votes

The help file for predict.lrm has a clear example. Here is a slight modification of it:

L <- predict(fit, newdata=data.frame(...), se.fit=TRUE)
plogis(with(L, linear.predictors + 1.96*cbind(- se.fit, se.fit)))

For some problems you may want to use the gendata or Predict functions, e.g.

L <- predict(fit, gendata(fit, var1=1), se.fit=TRUE)  # leave other vars at median/mode
Predict(fit, var1=1:2, var2=3)   # leave other vars at median/mode; gives CLs