I use the randomForest package in R with a rolling window to predict the returns of a single stock. I have developed a basket of features for this purpose, and my goal is to understand their relative predictive power.
My challenge is that I cannot trust the standard variable importance measures from random forest, because most of my features are highly autocorrelated. For example, a moving average spans a window of several days, which means that each value of that feature contains information shared across several adjacent observations in my data set.
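To illustrate the overlap (with synthetic data, not my actual features): a 10-day moving average of even a pure white-noise return series is strongly autocorrelated at lag 1, so adjacent rows of the feature matrix share most of their information.

```r
# Synthetic illustration: a 10-day moving average of iid noise is
# highly autocorrelated, even though the underlying returns are not.
set.seed(42)
ret  <- rnorm(500)                                   # iid "returns"
ma10 <- stats::filter(ret, rep(1 / 10, 10), sides = 1)  # 10-day MA
ma10 <- ma10[!is.na(ma10)]                           # drop warm-up NAs

# Lag-1 autocorrelation of the feature (theoretical value is 0.9)
lag1 <- acf(ma10, lag.max = 1, plot = FALSE)$acf[2]
```

So any resampling scheme that treats rows as independent, OOB sampling included, will place near-duplicates of training rows into the evaluation set.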
This implies that the out-of-bag samples generated by random forest will be correlated with the in-sample observations used to train each tree. The variable importance computed from them would therefore be highly optimistic, reflecting overfitting rather than genuine predictive power.
The solution I see is to compute variable importance on a truly out-of-sample test set rather than relying on the OOB estimates, so that the evaluation data share no overlap with the training window.
My question: does an R package exist that computes and extracts variable importance from a user-supplied test set rather than from the standard OOB samples? If not, can you suggest an approach to achieve this? Thank you for your help.
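In case it helps clarify what I am after, here is a rough sketch of the kind of computation I mean: permutation importance evaluated on a held-out test set, measured as the increase in test MSE when one feature is shuffled. The function and variable names (`perm_importance`, `test`, `response`) are illustrative, and the function only relies on `predict()`, so it should apply to a fitted `randomForest` object as well as other models.

```r
# Sketch: permutation importance on a held-out test set.
# `fit` is any fitted model with a predict(fit, newdata) method
# (e.g. a randomForest object); `test` is a data frame containing
# the features plus the response column named by `response`.
perm_importance <- function(fit, test, response, n_rep = 10) {
  X <- test[, setdiff(names(test), response), drop = FALSE]
  y <- test[[response]]
  base_mse <- mean((predict(fit, X) - y)^2)   # baseline test-set error
  sapply(names(X), function(v) {
    mean(replicate(n_rep, {
      Xp <- X
      Xp[[v]] <- sample(Xp[[v]])              # permute one feature
      mean((predict(fit, Xp) - y)^2)          # error with v scrambled
    })) - base_mse                            # increase in error = importance
  })
}
```

Because the permuting happens entirely on the test set, the importance scores inherit whatever train/test separation I impose (e.g. a chronological split with a gap), which is exactly the property OOB sampling lacks here.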