When I test my trained randomForest model on new test data that has fewer factor levels than my training data, predict() returns the following error:
Type of predictors in new data do not match that of the training data.
My training data has a variable with 7 factor levels and my test data has that same variable with 6 factor levels (all 6 ARE in the training data).
When I add an observation containing the "missing" 7th factor level to the test data, the prediction runs, so I'm not sure why this happens or what the logic behind it is.
I could understand randomForest choking if the test set had more or different factor levels, but why does it fail when the training set is the one with "more" levels?
levels() of the factor. – MrFlick
test$val <- factor(test$val, levels=levels(train$val)) or something like that. You don't exactly have a reproducible example here, so it's difficult to be specific. – MrFlick
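To make the suggestion in the comment concrete, here is a minimal base R sketch. The data frames `train` and `test`, the column `val`, and the level names "a" to "g" are all made-up placeholders, since the question gives no reproducible example. A factor stores integer codes against its own set of levels, so a 6-level factor and a 7-level factor are not the same kind of predictor even when every value in the smaller one appears in the larger one. That is presumably what the type check in predict() is objecting to. Re-declaring the test factor with the training levels makes the two line up without adding a dummy row.

```r
# Hypothetical toy data: `train`, `test`, `val`, and the level names are placeholders.
train <- data.frame(val = factor(c("a", "b", "c", "d", "e", "f", "g")))
test  <- data.frame(val = factor(c("a", "b", "c", "d", "e", "f")))

# The level sets differ even though every test value occurs in the training data.
nlevels(train$val)
nlevels(test$val)

# Re-declare the test factor using the training levels.
# Existing values are kept; the unused level "g" is simply added to the level set.
test$val <- factor(test$val, levels = levels(train$val))

# The level sets now match, and no value was turned into NA.
identical(levels(test$val), levels(train$val))
anyNA(test$val)
```

If the test data contained a value that is not among the training levels, this call would turn it into NA, so checking anyNA() afterwards is a cheap safeguard.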