I have a factor with missing values. I know that this factor value depends on the combination of a few dates.
I'm having some trouble getting this to work though. Seems both classes are tricky, especially Date.
For a simple example, let's have 1 Date and 1 factor:
library(VIM)   # provides kNN()
library(mice)  # provides mice()
toimpute <- data.frame(mydates = seq(as.Date("1990-01-01"), as.Date("2000-01-01"), by = 50),
                       imputeme = c(NA, NA, rep(c("a", "b", "c"), 24)))
toimpute$imputeme <- as.factor(toimpute$imputeme)
It seems kNN won't go for it:
imputed <- kNN(toimpute,variable = "imputeme")
Error in `[.data.frame`(data.x, , i) : undefined columns selected
mice also doesn't like it. I thought mice was at least supposed to work with factors, though this message says it must be numeric (perhaps it allows factor dependent variables but only numeric for independent variables?):
imputed <- mice(toimpute)
 iter imp variable
  1   1  imputeme
Error in FUN(newX[, i], ...) : 'x' must be numeric
In addition: Warning messages:
1: In var(data[, j], na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: In FUN(newX[, i], ...) : NAs introduced by coercion
I guess if nothing else I can do a random forest model to predict the class of the observations with missing data, but if there's a way to do it with one of the more common missing value functions I'd like to know.
aregImpute works on factor variables. Check this link – Joseph Wood
transcan from Hmisc. – Joseph Wood
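Both errors above point at the Date column rather than the factor, so one thing worth trying is to turn the date into a plain number before imputing. The sketch below is only an illustration in base R, not the method used by kNN or mice: it rebuilds the example data, converts mydates to days since 1970-01-01 with as.numeric(), and fills each missing factor value from the nearest observed row by date. The column names mydates_num and imputeme_filled and the helper impute_nearest() are made up for this example.

```r
# Rebuild the example data from the question (base R only).
toimpute <- data.frame(
  mydates  = seq(as.Date("1990-01-01"), as.Date("2000-01-01"), by = 50),
  imputeme = factor(c(NA, NA, rep(c("a", "b", "c"), 24)))
)

# A Date is stored as the number of days since 1970-01-01, so as.numeric()
# yields an ordinary numeric predictor (mydates_num is a made-up name).
toimpute$mydates_num <- as.numeric(toimpute$mydates)

# Hypothetical helper: 1-nearest-neighbour fill. Each missing value of the
# factor f takes the value of the observed row whose x is closest.
impute_nearest <- function(x, f) {
  observed <- which(!is.na(f))
  for (i in which(is.na(f))) {
    nearest <- observed[which.min(abs(x[observed] - x[i]))]
    f[i] <- f[nearest]
  }
  f
}

toimpute$imputeme_filled <- impute_nearest(toimpute$mydates_num,
                                           toimpute$imputeme)
head(toimpute)
```

With the numeric column in place, it may also be worth retrying kNN(toimpute, variable = "imputeme", dist_var = "mydates_num"), or mice() on a copy of the data frame without the original Date column. I have not verified either call on this data, so treat that as an assumption rather than a confirmed fix.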