1
votes

I have used mice to create 50 imputations of my dataset and now want to fit an elastic net with the glmnet package. I understand that the appropriate way to analyse imputed data is to apply the with and pool functions to the mids object returned by mice(x, ...), but glmnet requires its predictors to be supplied as a matrix. Both model.matrix and build.x (from the useful package) can convert a data frame to a matrix. The mids object can be converted to a data.frame with complete(), but stacking all the imputations and analysing them as a single dataset would appear to undermine the whole imputation process.

Example:

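library(mice)
library(useful) # provides build.x

df <- mice::nhanes
imp <- mice(df) # impute data (default m = 5 imputations)
com <- complete(imp, action = "long", include = TRUE) # stacked data frame; original data kept as .imp == 0
mat <- build.x(bmi ~ age + hyp + chl, com, contrasts = FALSE)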

Assuming the imputations are accurate, what is the most appropriate way to preserve the imputations and create the relevant matrices for use in glmnet?

1 Answer

2
votes

The easiest way to do this is to use my glmnetUtils package, which implements a formula/data frame interface for glmnet. You can then fit your elastic net as you would with any other R model-building function.

install.packages("glmnetUtils")
library(glmnetUtils)

# ... do whatever is required to create an analysis data frame ...

# alpha defaults to 1 (lasso); pass e.g. alpha = 0.5 for an elastic net mix
glmnet(bmi ~ age + hyp + chl, data=com)
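
To keep the imputations separate rather than stacking them, one option is to fit a model on each completed dataset and then combine the coefficients. The following is only a minimal sketch under several assumptions: it uses plain glmnet with model.matrix rather than glmnetUtils, the values m = 5, alpha = 0.5 and nfolds = 5 are arbitrary choices for illustration, and averaging coefficients across imputations is a simple heuristic, not Rubin's rules as applied by pool(), which need standard errors that glmnet does not report. Different imputations may also select different variables, so the averaged vector should be read with care.

```r
library(mice)
library(glmnet)

set.seed(1)
imp <- mice(nhanes, m = 5, printFlag = FALSE)  # m = 5 chosen only for illustration

# Fit one cross-validated elastic net per completed dataset
fits <- lapply(seq_len(imp$m), function(i) {
  d <- mice::complete(imp, i)                        # i-th completed dataset, no NAs
  x <- model.matrix(bmi ~ age + hyp + chl, d)[, -1]  # drop the intercept column
  cv.glmnet(x, d$bmi, alpha = 0.5, nfolds = 5)       # alpha = 0.5 is an arbitrary mix
})

# Extract coefficients at each fit's lambda.min: one column per imputation
coefs <- sapply(fits, function(f) as.matrix(coef(f, s = "lambda.min"))[, 1])

# Heuristic combination (an assumption, not Rubin's rules): average across imputations
pooled <- rowMeans(coefs)
pooled
```

Looking at coefs column by column before averaging shows how much the selected variables and their estimates vary between imputations.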