I have a question about calculating an AUC for every individual in a dataset, after imputation using MICE.

I know how to do it in a complete-case dataset. I have done it as follows:

id <- c(1,2,3,4,5,6,7,8,9,10)
measure_1 <- c(60,80,90,55,60,61,77,67,88,90)
measure_2 <- c(55,88,88,55,70,61,80,66,65,92)
measure_3 <- c(62,88,85,56,68,62,89,62,70,99)
measure_4 <- c(62,90,83,54,65,62,91,59,67,96)
dat <- data.frame(id, measure_1, measure_2, measure_3, measure_4)
dat
x <- c(0,7,14,21) # number of days

library(Bolstad2)
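f <- function(a){
   # Measurements of patient a (columns 2 to 5) as a plain numeric vector
   vector_patient <- unlist(dat[a, 2:5], use.names = FALSE)
   # Integrate over the time points in x and keep the value of the integral
   AUCpatient <- sintegral(x, vector_patient)$int
   return(AUCpatient)
}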

vector <- c(1:10)
listAUC <- lapply(vector, f)
vector_AUC <- unlist(listAUC, use.names=FALSE)
vector_AUC

This gave me a vector with all the AUCs for all patients. This vector can be added to my dataset if I want to.
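The same per-patient calculation can also be written without looping over row indices. The following is only a minimal sketch in base R, and it assumes the trapezoidal rule is acceptable (sintegral from Bolstad2 uses Simpson's rule, so the numbers will differ somewhat). The helper name trapz_auc is hypothetical and not part of any package.

```r
# Time points (days) and the complete-case data from above
x <- c(0, 7, 14, 21)
dat <- data.frame(
  id        = 1:10,
  measure_1 = c(60, 80, 90, 55, 60, 61, 77, 67, 88, 90),
  measure_2 = c(55, 88, 88, 55, 70, 61, 80, 66, 65, 92),
  measure_3 = c(62, 88, 85, 56, 68, 62, 89, 62, 70, 99),
  measure_4 = c(62, 90, 83, 54, 65, 62, 91, 59, 67, 96)
)

# Hypothetical helper: trapezoidal AUC of values y observed at times x
trapz_auc <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Apply the helper to each row of the four measurement columns
dat$AUC <- unname(apply(as.matrix(dat[, 2:5]), 1, function(y) trapz_auc(x, y)))
dat
```

Because the result is stored directly as a column, no separate lapply/unlist step is needed.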

But now I have a problem: I have missing values in my dataset. My dataset can be obtained using the following code:

id <- c(1,2,3,4,5,6,7,8,9,10)
measure1 <- c(60,NA,90,55,60,61,77,67,88,90)
measure2 <- c(55,NA,NA,55,70,NA,80,66,65,92)
measure3 <- c(62,88,85,NA,68,62,89,62,70,99)
measure4 <- c(62,90,83,54,NA,62,NA,59,67,96)
datmis <- data.frame(id, measure1, measure2, measure3, measure4)
datmis

I want to impute this dataset using MICE.

library(mice)
imp <- mice(datmis, maxit = 0)
meth <- imp$method
pred <- imp$predictorMatrix
imp <- mice(datmis, method = meth, predictorMatrix = pred, seed = 2018, maxit = 10, m = 5)

So now I have everything imputed. I want to compute the AUC for every individual in every imputed dataset, and then pool the results into one single AUC per individual. However, the function created in the previous example does not work anymore. Can anyone help me out?

1 Answer

Here's one way to do it. After you run your imputation you can:

  1. Run through each imputed dataset
  2. Compute the AUC with the imputed data
  3. Pool the estimates together using Rubin's rules

The first two points are covered by the code below:

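x <- c(0,7,14,21) # number of days
library("tidyverse")
library("MESS")
# One completed data frame per imputation (m = 5), each with an added AUC column
res <- lapply(1:5, function(i) {
    mice::complete(imp, i) %>%   # explicit namespace: tidyr also exports complete()
    group_by(id) %>%
    mutate(AUC = MESS::auc(x, c(measure1, measure2, measure3, measure4))) %>%
    ungroup()})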

I'm using the auc function from the MESS package, since it is rather fast and flexible, but you could replace it with your own version.

This produces a list of 5 data frames that can be used for pooling the estimates (part 3 from the list above).

library("mitools")
with(imputationList(res), lm(AUC ~ 1)) %>% pool() %>% summary()

This produces the pooled mean AUC across all individuals (the intercept of the model AUC ~ 1):

            estimate std.error statistic       df      p.value
(Intercept)  1512.77  81.62359  18.53349 7.389246 1.829668e-07
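If the goal is one pooled AUC per individual rather than the overall mean, one option is to average each individual's AUC over the imputed data sets. The sketch below rests on the assumption that a plain average is an adequate pooled point estimate here, since a single AUC per imputation carries no within-imputation variance to combine. The names trapz_auc and pool_auc_by_id and the toy values are hypothetical; with mice, the input list could be built as lapply(seq_len(imp$m), function(i) mice::complete(imp, i)).

```r
# Hypothetical helper: trapezoidal AUC of values y observed at times x
trapz_auc <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# completed: list of completed data frames, one per imputation, same row order
# cols:      names of the measurement columns, in time order
pool_auc_by_id <- function(completed, x, cols) {
  # Matrix of AUCs: one row per individual, one column per imputation
  auc_mat <- unname(do.call(cbind, lapply(completed, function(d) {
    apply(as.matrix(d[, cols]), 1, function(y) trapz_auc(x, y))
  })))
  data.frame(
    id       = completed[[1]]$id,
    AUC_mean = rowMeans(auc_mat),        # average over imputations
    AUC_sd   = apply(auc_mat, 1, sd)     # between-imputation spread
  )
}

# Toy stand-in for two completed data sets (made-up values, not mice output)
toy <- list(
  data.frame(id = 1:2, measure1 = c(60, 80), measure2 = c(55, 88),
             measure3 = c(62, 88), measure4 = c(62, 90)),
  data.frame(id = 1:2, measure1 = c(60, 84), measure2 = c(55, 88),
             measure3 = c(62, 88), measure4 = c(62, 90))
)
pooled <- pool_auc_by_id(toy, x = c(0, 7, 14, 21), cols = paste0("measure", 1:4))
pooled
```

Individuals without missing values get the same AUC in every imputation, so their between-imputation spread is zero.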

One more comment regarding the imputation: are you sure you want to predict the measures using id as a numeric variable? That makes id enter the imputation models as a regression-like predictor of the missing values, which seems rather unrealistic.
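A possible way to leave id out of the imputation models is to zero its column in the predictor matrix before calling mice. The sketch below builds a stand-in matrix in base R with the layout mice uses (rows are the variables being imputed, columns are the predictors); in practice you would modify the pred object taken from imp$predictorMatrix in the question instead.

```r
vars <- c("id", "measure1", "measure2", "measure3", "measure4")

# Stand-in predictor matrix: every variable predicts every other variable
pred <- matrix(1, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))
diag(pred) <- 0

# Do not use id as a predictor for any variable
pred[, "id"] <- 0
pred

# With mice loaded, the matrix is then passed on as in the question:
# imp <- mice(datmis, method = meth, predictorMatrix = pred,
#             seed = 2018, maxit = 10, m = 5)
```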