I'm debugging code that uses the randomForest package, with almost no previous R experience.
I've reached a point where, when executing predict.randomForest, I get the error:
New factor levels not present in the training data.
Searching this site I've found the reason and understood that I need to delete the records that are causing the problem.
How can I isolate (find out) which columns/rows are causing the problems?
Run str(X), where X is the matrix of predictors in your training data. Then do the same for your test data, and look in the output to see which variable(s) have a different number or set of levels. – ulfelder
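If eyeballing the str() output gets tedious, the same comparison can be done programmatically. The following is only a rough sketch on made-up data: the data frames train and test and the columns color and size are hypothetical stand-ins for your own training and test sets, and it uses base R only.

```r
# Hypothetical toy data standing in for the real training and test sets.
train <- data.frame(
  color = factor(c("red", "blue", "red")),
  size  = c(1.0, 2.5, 3.1)
)
test <- data.frame(
  color = factor(c("red", "green", "blue", "green")),
  size  = c(0.7, 1.9, 2.2, 4.0)
)

# Names of the columns that are stored as factors in the training data.
factor_cols <- names(train)[sapply(train, is.factor)]

# For each factor column, collect the (non-missing) values that occur in
# the test data but are not among the training levels.
new_levels <- lapply(factor_cols, function(col) {
  vals <- unique(as.character(test[[col]]))
  setdiff(vals[!is.na(vals)], levels(train[[col]]))
})
names(new_levels) <- factor_cols

# Keep only the columns that actually have unseen levels.
new_levels <- new_levels[lengths(new_levels) > 0]

# Row indices of the test data that contain at least one unseen level.
bad_rows <- which(Reduce(`|`, lapply(names(new_levels), function(col) {
  as.character(test[[col]]) %in% new_levels[[col]]
}), FALSE))

print(new_levels)  # offending columns and their unseen levels
print(bad_rows)    # offending rows in the test data
```

Here new_levels names the problem columns and bad_rows gives the problem rows, so something like test[-bad_rows, ] would drop them (guard against bad_rows being empty first, since negative indexing with an empty vector returns no rows).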