I noticed a peculiar behavior of apply() when trying to replace NA with the number 9 in multiple factor variables. I had already defined the levels and labels of those variables. When I use ifelse() on each variable individually, e.g. ifelse(is.na(x),9,x), it coerces the variable to integer, which is understandable. However, when I wrap the same call in a function and use apply() over multiple columns, it coerces all variables to character. Adding one more step inside the function to convert them back to factor doesn't help. Did I miss something, or is there something strange about the apply() functions? Thanks!
a <- c(1, 2, 3, NA, 2)
b <- c(2, 1, 2, 2, NA)
a <- factor(a, levels = c(1, 2, 3), labels = c("First", "Second", "Third"))
b <- factor(b, levels = c(1, 2, 3), labels = c("AA", "BB", "CC"))
dat <- cbind(a, b)
replace.na <- function(x) {
  x <- as.factor(ifelse(is.na(x), 9, x))
}
# Single variable
a <- ifelse(is.na(a), 9, a)
str(a)
# Multiple columns via apply()
dat <- apply(dat, 2, replace.na)
str(dat)
I would expect apply() to produce the same type of variable as the single-variable call, or at least that using as.factor() inside the function would coerce each variable to a factor.
The apply function usually returns a matrix, and R matrices cannot contain factors. So it's not a peculiar behavior. It's a design feature. – IRTFM
The dat object doesn't have any factor components either, since you used cbind. – IRTFM
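To illustrate the two comments above, here is a minimal sketch of one possible workaround. It assumes the goal is to keep each column a factor and to show the missing values as an extra level labelled "9". The helper name replace_na_level and its fill argument are made up for this example and do not come from the original post; data.frame() and lapply() are swapped in for cbind() and apply().

```r
# Rebuild the example factors from the question.
a <- factor(c(1, 2, 3, NA, 2), levels = c(1, 2, 3),
            labels = c("First", "Second", "Third"))
b <- factor(c(2, 1, 2, 2, NA), levels = c(1, 2, 3),
            labels = c("AA", "BB", "CC"))

# cbind() on factors yields an integer matrix of the underlying codes,
# so the labels are already gone before apply() ever runs.
m <- cbind(a, b)
is.matrix(m)
typeof(m)

# A data frame keeps each column as its own factor.
dat <- data.frame(a, b)

# Hypothetical helper (not from the post): add the fill value as an
# explicit level, then assign it to the missing entries.
replace_na_level <- function(x, fill = "9") {
  levels(x) <- c(levels(x), fill)
  x[is.na(x)] <- fill
  x
}

# lapply() works column by column; assigning into dat[] keeps the
# data.frame shape and the column names.
dat[] <- lapply(dat, replace_na_level)
str(dat)
```

Because lapply() hands each column to the helper as a factor rather than as a slice of a matrix, the existing labels survive. If a numeric 9 is really what is needed, the columns would have to be converted to their integer codes first, which drops the labels.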