1
votes

I am using the Amelia package in R to handle missing values. I get the error below when I try to train a random forest on the imputed data. I am not sure how to convert the amelia object into a data frame, which is the input that the randomForest function in R expects.

train_data<-read.csv("train.csv")
sum(is.na(train_data))

impute<- amelia(x=train_data,m=5,idvars=c("X13"), interacs=FALSE)
impute<= as.data.frame(impute)

for(i in 1:impute$m) {  
  model <- randomForest(Y ~X1+X2+X3+X4+X5+X6,
                 data= as.data.frame(impute))
}

Error in as.data.frame.default(impute) : 
  cannot coerce class ""amelia"" to a data.frame

If I pass impute$imputations[[i]] to randomForest instead, I get the error below:

 model <- randomForest(Y ~X1+X2+X3+X4+X5+X6,
                 impute$imputations[[i]])
Error: $ operator is invalid for atomic vectors

Can anyone suggest how I can solve this problem? It would be a great help.

2
@RichardScriven Did you mean I should do impute <- as.matrix(impute)? - Nikita
I think you should look at unclass(impute) first. That should give you some idea of what the object actually looks like. Forget about as.matrix, I was wrong about that. - Rich Scriven
@RichardScriven Thanks for the explanation. I am still not able to get past this error. I get a similar error with aregImpute when I pass the imputed test data set to the predict function: impute_valid <- aregImpute(Y~X1+X2+X3+X4+X5+X6, data=test_data, n.impute=5, nk=0); predicted_valid <- predict(model, newdata=impute_valid, type="response") Error in as.data.frame.default(data) : cannot coerce class ""aregImpute"" to a data.frame - Nikita

2 Answers

0
votes

So, I think the first problem is this line here:

impute<= as.data.frame(impute)

Should be:

impute <- as.data.frame(impute)

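That corrected line will still throw the error shown in the question, though, because as.data.frame has no method for objects of class "amelia".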

Multiple imputation replaces the data with multiple data sets, each with different replacements for the missing values. This reflects the uncertainty in the predictions for those missing values. By turning the Amelia object into a data frame you are trying to make one data frame out of 5 data frames, and it's not obvious how to do this.
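
Instead of collapsing the 5 data frames into one, you could fit one model per completed data set. Here is a minimal sketch, assuming the randomForest package is installed and that the completed data sets sit in the $imputations list of the amelia() output. fit_per_imputation is a hypothetical helper name, not part of either package:

```r
# Hypothetical helper: fit one random forest per completed data set.
# `a_out` is assumed to be the object returned by amelia(); its
# $imputations element is a list with one completed data set per run.
fit_per_imputation <- function(a_out, model_formula) {
  lapply(a_out$imputations, function(completed) {
    # Skip entries that are not data frames (for example, a run that
    # did not produce a usable data set).
    if (!is.data.frame(completed)) return(NULL)
    randomForest::randomForest(model_formula, data = completed)
  })
}

# Usage with the objects from the question (not run here):
# impute <- amelia(x = train_data, m = 5, idvars = c("X13"))
# models <- fit_per_imputation(impute, Y ~ X1 + X2 + X3 + X4 + X5 + X6)
```

This returns a list of models, one per imputation. How to combine their predictions (averaging, for instance) is a separate choice that this sketch does not make for you.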

You might want to look into simpler forms of imputation (like imputing by the mean).
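
As a rough illustration of mean imputation in base R, something like the following would do. impute_mean is a hypothetical helper name and the toy data frame is made up purely for the example:

```r
# Replace NAs in every numeric column with that column's observed mean.
# Non-numeric columns (such as an ID column) are left untouched.
impute_mean <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Made-up data for illustration only
toy <- data.frame(X1 = c(1, NA, 3),
                  X2 = c(NA, 4, 6),
                  X13 = c("a", "b", "c"))
filled <- impute_mean(toy)
```

The result is an ordinary data frame, so it can go straight into randomForest. Keep in mind that a single fill-in like this ignores the uncertainty that multiple imputation is designed to capture.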

-1
votes

This happens because you are training on the object that amelia returns, which holds information about the imputation you did rather than a data set you can train on. You need to use the function complete to fill the imputed values into the data set.

impute <- amelia(x=train_data,m=5,idvars=c("X13"), interacs=FALSE)
impute <- complete(impute,1)
impute <- as.data.frame(impute)

After this you should be able to train on the data and predict from it without that error.
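
One caveat: as far as I know, complete() comes from the mice package and works on mids objects, so it may not be available for an amelia object. Here is a sketch of an alternative, assuming the completed data sets are stored in the $imputations list of the amelia() output. get_completed is a hypothetical helper name:

```r
# Hypothetical helper: pull the i-th completed data set out of an
# amelia() result and check that it really is a data frame.
get_completed <- function(a_out, i = 1) {
  completed <- a_out$imputations[[i]]
  if (!is.data.frame(completed)) {
    stop("imputation ", i, " is not a data frame; check the amelia() output")
  }
  completed
}

# Usage with the objects from the question (not run here):
# train_imputed <- get_completed(impute, 1)
# model <- randomForest(Y ~ X1 + X2 + X3 + X4 + X5 + X6, data = train_imputed)
```

The explicit check is there because an entry that is not a data frame would otherwise only surface later as a less helpful error inside randomForest.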