I built a predictive model that classifies people into two income categories, <=50K and >50K.
But when I open my file in Excel or R to see the final predictions, I see only the levels (1 and 2) that I assigned at the start to simplify the process, instead of my original values.
Please tell me how to retain the original values represented by the levels, rather than the levels themselves.
Here is the outline I followed.
This is my target variable, Income.Group, in its initial state:
str(train_gbW7HTd$Income.Group)
chr [1:32561] "<=50K" "<=50K" "<=50K" "<=50K" "<=50K" "<=50K" "<=50K" ...
Now, to apply decision trees, I encoded my target variable into levels 1 and 2. I used the following code:
train$Income.Group <- match(train$Income.Group, unique(train$Income.Group))
and got:
table(train$Income.Group)
1 2
24720 7841
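To make the encoding step concrete, here is a small sketch on a toy vector. The values and the names income, income_labels, income_codes and income_restored are made up for illustration and are not the real data. It suggests that match() with unique() simply replaces each string by its position among the unique values, so the original labels should be recoverable as long as that lookup is kept before the column is overwritten:

```r
# Toy stand-in for train$Income.Group (hypothetical values, not the real data)
income <- c("<=50K", "<=50K", ">50K", "<=50K", ">50K")

# Keep the lookup before overwriting the column:
# position 1 holds "<=50K", position 2 holds ">50K"
income_labels <- unique(income)

# Same encoding as above: each value becomes its position in the lookup
income_codes <- match(income, income_labels)

# Indexing the lookup with the codes recovers the original strings
income_restored <- income_labels[income_codes]

print(income_codes)
print(income_restored)
```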
I built the decision tree like this:
library(rpart)
set.seed(333)
fit <- rpart(Income.Group ~ ., data = train, method = "class", control = rpart.control(minsplit = 20, minbucket = 100, maxdepth = 10, xval = 5))
Then I made predictions:
pred <- predict(fit, test, type = "class")
pred_train <- predict(fit, train, type = "class")
confusionMatrix(pred_train, train$Income.Group)  # confusionMatrix() is from the caret package
I saved my file:
solution.frame <- data.frame(ID = test$ID, Income.Group = pred)
write.csv(solution.frame, file = "final_solution.csv")
But my final CSV file has the levels 1 and 2 as the final predictions, not <=50K and >50K, which is what I actually want. Please tell me how to proceed. Thanks in advance.
I already tried:
solution.frame$Income.Group <- ifelse(solution.frame$Income.Group == "1", "<=50k", ">50k")
but it gave the single value >50k to the entire Income.Group column.
Please tell me what to do, as I am stuck at this step and unable to complete my model submission.
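One possible way to get the labels back is sketched below, under two assumptions: the lookup unique(train$Income.Group) was saved before the column was overwritten, and pred is a factor with levels "1" and "2", which is what predict(fit, test, type = "class") returns for an rpart classification tree. The toy pred, the IDs and the name income_labels are hypothetical stand-ins, and the sketch does not explain why the ifelse() attempt produced a single value.

```r
# Lookup assumed to have been saved before encoding
# (the first unique value seen becomes code 1)
income_labels <- c("<=50K", ">50K")

# Stand-in for predict(fit, test, type = "class"), which returns a factor
pred <- factor(c("1", "2", "1", "1"), levels = c("1", "2"))

# Go through as.character() first so the code comes from the level text
# rather than from the factor's internal integer index
pred_codes <- as.integer(as.character(pred))

# Hypothetical IDs standing in for test$ID
solution.frame <- data.frame(ID = 101:104,
                             Income.Group = income_labels[pred_codes],
                             stringsAsFactors = FALSE)

print(solution.frame)
```

An alternative that may avoid the problem entirely is to encode the target with factor(train$Income.Group) instead of match(). rpart with method = "class" accepts a factor response, and predict(..., type = "class") then returns the original labels directly. Passing row.names = FALSE to write.csv() also keeps the extra row-number column out of the saved file.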