2 votes

Random forest is a robust algorithm. A random forest trains many small trees, and the out-of-bag (OOB) samples already provide an accuracy estimate. Is it still necessary to run cross-validation with a random forest as well?
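
For concreteness, here is a minimal sketch of the OOB accuracy the question refers to, assuming scikit-learn's RandomForestClassifier; the synthetic dataset is purely illustrative:

```python
# Each tree is fit on a bootstrap sample; the rows left out of that sample
# ("out-of-bag") act as a built-in validation set for the forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rf = RandomForestClassifier(
    n_estimators=200,
    oob_score=True,   # collect out-of-bag predictions while fitting
    random_state=0,
)
rf.fit(X, y)

print("OOB accuracy:", rf.oob_score_)  # estimate of generalization accuracy
```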


2 Answers

4 votes

OOB error is an unbiased estimate of the generalization error for random forests, so that is great. But what are you using the cross-validation for? If you are comparing the RF against some other algorithm that doesn't use bagging in the same way, you want a low-variance way to compare them, and you have to use cross-validation anyway to evaluate the other algorithm. In that case, using the same cross-validation splits for the RF and the other algorithm is still a good idea, because it removes the variance caused by the split selection.
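
As an illustration of that last point, here is a sketch assuming scikit-learn; the LogisticRegression competitor and the synthetic dataset are just placeholders for "some other algorithm":

```python
# Evaluate the random forest and a non-bagging competitor on the *same*
# cross-validation splits, so split selection adds no extra variance
# to the comparison.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# One fixed splitter object -> identical folds for both models.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

rf_scores = cross_val_score(
    RandomForestClassifier(n_estimators=200, random_state=0), X, y, cv=cv
)
lr_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)

print("RF CV accuracy:", rf_scores.mean())
print("LR CV accuracy:", lr_scores.mean())
```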

If you are comparing one RF against another RF with a different feature set, then comparing OOB errors is reasonable. This is especially true if you make sure both RFs use the same bootstrap samples during training.
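
A sketch of that comparison, again assuming scikit-learn; fixing the same random_state on both forests is used here as a rough way to keep the bootstrap sampling comparable, which is an assumption about the library's seeding rather than something the answer prescribes:

```python
# Compare two feature sets for the same task by their OOB accuracies.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_small = X[:, :10]  # hypothetical reduced feature set

def oob_accuracy(features):
    # Same random_state for both fits, so the bootstrap draws stay comparable.
    rf = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=42)
    rf.fit(features, y)
    return rf.oob_score_

print("OOB accuracy, all features:    ", oob_accuracy(X))
print("OOB accuracy, reduced features:", oob_accuracy(X_small))
```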

4 votes

You do not need to perform any kind of validation if you just want to use the model and don't care about the risk of overfitting.

For scientific publishing (or anything else where you compare the quality of different classifiers), you should validate your results, and cross-validation is the best practice here.