I'm trying to solve the Titanic data set from Kaggle. I have done almost all the work on the training data: train has 891 observations of 12 variables and test has 418 observations of 11 variables.
I fit a decision tree with the rpart method. On the training data the confusion matrix looks like this:
confusionMatrix(pred_train, train$Survived)

Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0 549   0
         1   0 342

Accuracy : 1
95% CI : (0.996, 1)
No Information Rate : 0.616
P-Value [Acc > NIR] : <0.0000000000000002
Kappa : 1
Mcnemar's Test P-Value : NA
Sensitivity : 1.000
Specificity : 1.000
Pos Pred Value : 1.000
Neg Pred Value : 1.000
Prevalence : 0.616
Detection Rate : 0.616
Detection Prevalence : 0.616
Balanced Accuracy : 1.000
'Positive' Class : 0
When I then predict on the test set with pred <- predict(fit, test, type = "class"), I get:
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = attr(object, : factor Name has new levels Abbott, Master. E...
How can I solve this problem? The train and test data sets have different numbers of observations (891 and 418), and I have already removed the identifier (PassengerId) from the train data set.
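
In case it helps, here is a minimal sketch that reproduces the same kind of error. It assumes the tree was fit with a formula like Survived ~ ., so that Name goes into the model as a factor predictor. The toy data frames, their values, and the rpart.control settings are made up for illustration and are not the Kaggle data or my exact call.

```r
library(rpart)

# Toy stand-ins for train/test (hypothetical values, not the Kaggle data).
# Name is a factor with a different level for every row, as with passenger names.
train <- data.frame(
  Survived = factor(c(0, 1, 1, 0, 1, 0)),
  Name     = factor(c("A", "B", "C", "D", "E", "F")),
  Sex      = factor(c("male", "female", "female", "male", "female", "male")),
  Age      = c(22, 38, 26, 35, 27, 54)
)
test <- data.frame(
  Name = factor(c("G", "H")),   # names that never occur in train
  Sex  = factor(c("male", "female")),
  Age  = c(30, 19)
)

# Assumed fitting call: every remaining column, including Name, is a predictor.
# The control settings only make the tiny example split; xval = 0 skips cross-validation.
fit <- rpart(Survived ~ ., data = train, method = "class",
             control = rpart.control(minsplit = 2, cp = 0, xval = 0))

# Capture the error message instead of stopping the script.
err <- tryCatch(
  as.character(predict(fit, test, type = "class")),
  error = function(e) conditionMessage(e)
)
print(err)
```

In this sketch the error message refers to the factor Name and to levels that appear only in test, not to the number of rows in either data frame.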