How to use predict function with my pooled results from mice()?

Question

Hi, I just started using R as part of a module in school. I have a data set with missing values, and I have used mice() to impute them. I'm now trying to use the predict function with my pooled results, but I get the following error:

Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "c('mipo', 'data.frame')"

I have included my entire code below, and I'd greatly appreciate it if y'all could help a novice out. Thanks!

```{r}
library(magrittr)
library(dplyr)
train = read.csv("Train_Data.csv", na.strings=c("","NA"))
test = read.csv("Test_Data.csv", na.strings=c("","NA"))
cols <- c("naCardiac", "naFoodNutrition", "naGenitourinary", "naGastrointestinal", "naMusculoskeletal", "naNeurological", "naPeripheralVascular", "naPain", "naRespiratory", "naSkin")
# Convert the listed indicator columns to factors
train %<>%
       mutate(across(all_of(cols), factor))
test %<>%
       mutate(across(all_of(cols), factor))
str(train)
str(test)
```

```{r}
library(mice)
md.pattern(train)
```

```{r}
miTrain = mice(train, m = 5, maxit = 50, meth = "pmm")
```

```{r}
model = with(miTrain, lm(LOS ~ Age + Gender + Race + Temperature + RespirationRate + HeartRate + SystolicBP + DiastolicBP + MeanArterialBP + CVP + Braden + SpO2 + FiO2 + PO2_POCT + Haemoglobin + NumWBC + Haematocrit + NumPlatelets + ProthrombinTime + SerumAlbumin + SerumChloride + SerumPotassium + SerumSodium + SerumLactate + TotalBilirubin + ArterialpH + ArterialpO2 + ArterialpCO2 + ArterialSaO2 + Creatinine + Urea + GCS + naCardiac + naFoodNutrition + naGenitourinary + naGastrointestinal + naMusculoskeletal + naNeurological + naPeripheralVascular + naPain + naRespiratory + naSkin))
model
summary(model)
```

```{r}
modelResults = pool(model)
modelResults
```

```{r}
pred = predict(modelResults, newdata = test)
PredTest = data.frame(test$PatientID, modelResults)
str(PredTest)
summary(PredTest)
```

Marius · Accepted Answer · 2018-10-09T05:49:55

One slightly hacky way to achieve this may be to take one of the fitted models created by with() (they are stored in fit$analyses) and replace its stored coefficients with the final pooled estimates. I haven't done detailed testing, but it seems to work on this simple example:

```r
library(mice)

imp <- mice(nhanes, maxit = 2, m = 2)
fit <- with(data = imp, exp = lm(bmi ~ hyp + chl))
pooled <- pool(fit)

# Copy one of the lm models that was fitted to
#   one of the imputed datasets
pooled_lm = fit$analyses[[1]]
# Replace the fitted coefficients with the pooled
#   estimates (need to check they are replaced in
#   the correct order)
pooled_lm$coefficients = summary(pooled)$estimate

# Predict - predictions seem to match the
#   pooled coefficients rather than the original
#   lm that was copied
predict(fit$analyses[[1]], newdata = nhanes)
predict(pooled_lm, newdata = nhanes)
```
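The order check mentioned in the comments above could be done by matching on term names instead of relying on position. This is only a sketch: it assumes that `summary()` of the pooled object carries the term labels either in a `term` column (as in recent mice versions) or as row names (as in earlier 3.x versions), and the `seed` and `printFlag` arguments are added purely to make the run reproducible and quiet.

```r
library(mice)

imp <- mice(nhanes, maxit = 2, m = 2, seed = 123, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ hyp + chl))
pooled_summary <- summary(pool(fit))

# Term labels: assumed to be in a `term` column or, failing that, in the row names
pooled_terms <- if ("term" %in% names(pooled_summary)) {
  as.character(pooled_summary$term)
} else {
  rownames(pooled_summary)
}
pooled_est <- setNames(pooled_summary$estimate, pooled_terms)

pooled_lm <- fit$analyses[[1]]
# Stop early if the pooled terms and the lm terms are not the same set
stopifnot(setequal(pooled_terms, names(coef(pooled_lm))))
# Reorder the pooled estimates by name so each lands on the right term
pooled_lm$coefficients <- pooled_est[names(coef(pooled_lm))]

coef(pooled_lm)
```

Assigning a named vector also keeps the coefficient names on the copied model, which the positional assignment drops.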

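As far as I know, the point predictions from predict() for a linear regression depend only on the coefficients (together with the model terms), so you shouldn't have to replace other stored values in the fitted model. Standard errors and intervals (se.fit = TRUE or interval = ...) would still be computed from the copied model, and you would have to replace more of the stored values if applying methods other than predict().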
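One way to sanity-check this claim is to compare the patched model's predictions with a manual matrix product, and with the average of the predictions from each imputed-data model. The sketch below assumes that the pooled point estimate is the plain mean of the per-imputation coefficients (Rubin's rules), so it averages `coef()` directly instead of reading the pooled summary. The `seed`, `printFlag` and the choice of `complete(imp, 1)` as new data are illustrative only.

```r
library(mice)

imp <- mice(nhanes, m = 5, maxit = 5, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ hyp + chl))

# Pooled point estimates: mean of the coefficients across imputations
coef_mat <- sapply(fit$analyses, coef)  # one column per imputed dataset
pooled_coef <- rowMeans(coef_mat)       # named vector, same order as coef()

# Patch a copy of the first fitted model with the pooled coefficients
pooled_lm <- fit$analyses[[1]]
pooled_lm$coefficients <- pooled_coef[names(coef(pooled_lm))]

# Use one completed dataset as new data so that no predictor is missing
newdata <- complete(imp, 1)
pred_pooled <- predict(pooled_lm, newdata = newdata)

# Check 1: the predictions equal the design matrix times the pooled coefficients
X <- model.matrix(~ hyp + chl, data = newdata)
pred_manual <- drop(X %*% pooled_coef)

# Check 2: they also equal the average of the per-imputation predictions
pred_avg <- rowMeans(sapply(fit$analyses, predict, newdata = newdata))

all.equal(pred_pooled, pred_manual)
all.equal(pred_pooled, pred_avg)
```

The second check holds because a linear model's prediction is linear in its coefficients. For models with a non-linear link, averaging predictions and predicting from averaged coefficients would generally give different results.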
