I have a data frame with 15 variables and 4669 observations.
I am using a random forest for modelling. My goal is to predict whether a particular product will be accepted by the customer or not.
So my output variable has the factor levels "Yes", "No" and "".
My question is: is it possible to predict these "" values as Yes or No with a random forest?
The sample data looks like this:
Outputvar <- c("Yes", "Yes", "No", "NO", "", "")
Inputvar1 <- c("M", "F", "F", "M", "F", "M")
Inputvar2 <- c(34, 25, 40, 50, 60, 34)
# Build the data frame directly; cbind() would coerce every column to character
data <- data.frame(Outputvar, Inputvar2, Inputvar1)
I am new to R, so if my understanding is wrong, could anyone explain what could be done?
EDIT: This is the code I have tried so far:
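library(randomForest)  # the package name is case-sensitive
# exclude = NULL keeps every value, including NA, as a factor level
data$Outputvar <- factor(data$Outputvar, exclude = NULL)
# Random split of roughly 70% training and 30% test rows
ind0 <- sample(2, nrow(data), replace = TRUE, prob = c(0.7, 0.3))
train0 <- data[ind0 == 1, ]
test0 <- data[ind0 == 2, ]
# Fit the forest on the training rows, using all other columns as predictors
fit1 <- randomForest(Outputvar ~ ., data = train0)
print(fit1)
plot(fit1)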
EDIT2: Class counts in the output variable: NO: 3536, Yes: 1061, "": 72
data$Outputvar <- factor(data$Outputvar, exclude=NULL) – Marco Sandri
[...] data$Outputvar. You should correct this issue. – Marco Sandri
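One possible way to approach the question, sketched under the assumption that the "" entries are simply rows whose outcome is unknown: treat "" as missing instead of as a third class, train the forest only on the rows labeled Yes or No, and then call predict() on the unlabeled rows. The data frame `toy` and its simulated values are hypothetical and only mimic the structure of the sample data above; they are not the real data set.

```r
library(randomForest)

set.seed(42)  # make the simulated data and the forest reproducible

# Hypothetical toy data with the same column names as in the question
n <- 200
toy <- data.frame(
  Inputvar1 = factor(sample(c("M", "F"), n, replace = TRUE)),
  Inputvar2 = sample(20:65, n, replace = TRUE)
)
# Simulated outcome (assumption: purely illustrative rule)
toy$Outputvar <- ifelse(toy$Inputvar2 > 40, "Yes", "No")
toy$Outputvar[sample(n, 20)] <- ""  # 20 rows with an unknown outcome

# Treat "" as missing rather than as a class of its own
toy$Outputvar[toy$Outputvar == ""] <- NA
toy$Outputvar <- factor(toy$Outputvar, levels = c("No", "Yes"))

# Split into rows with a known outcome and rows to be predicted
labeled   <- toy[!is.na(toy$Outputvar), ]
unlabeled <- toy[is.na(toy$Outputvar), ]

# Train only on the labeled rows
fit <- randomForest(Outputvar ~ Inputvar1 + Inputvar2, data = labeled)

# Predict Yes/No for the rows that had ""
unlabeled$Predicted <- predict(fit, newdata = unlabeled)
table(unlabeled$Predicted)
```

With the real data, the same idea would apply after the labels have been made consistent (for example, a single spelling of each class), since every distinct spelling otherwise becomes its own factor level.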