
I am using Random Forest from scikit-learn to compute feature importances. However, the importances change when I change the random_state parameter of the RF. Is there any way to get robust feature importances with RF?


1 Answer


This follows from the principle behind the Random Forest algorithm. RF searches for a good model in a greedy, heuristic way. To compensate for that heuristic, it combines multiple trees, each built from randomly sampled features and samples. random_state supplies the random numbers used for this sampling. The documentation linked below says:

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
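To make the quoted description concrete, here is a minimal sketch of the three kinds of value random_state accepts. The synthetic dataset from make_classification and the seed 42 are my own assumptions, chosen only for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, assumed here purely for illustration.
X, y = make_classification(n_samples=200, n_features=6,
                           n_informative=3, random_state=0)

# int: used as the seed of the random number generator.
rf_int = RandomForestClassifier(n_estimators=20, random_state=42).fit(X, y)

# RandomState instance: used directly as the random number generator.
rng = np.random.RandomState(42)
rf_rng = RandomForestClassifier(n_estimators=20, random_state=rng).fit(X, y)

# None: falls back to NumPy's global random state, so results can
# differ from one run of the script to the next.
rf_none = RandomForestClassifier(n_estimators=20, random_state=None).fit(X, y)

for name, rf in [("int", rf_int), ("RandomState", rf_rng), ("None", rf_none)]:
    print(name, np.round(rf.feature_importances_, 3))
```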

So if you set random_state to a fixed value, you should get the same feature importances on every run with the same data. That does not make them robust, though: RF is not an algorithm that guarantees robustness. It gives an answer based on its heuristic search.
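A small sketch of the point about fixing random_state, on synthetic data that I am assuming only for illustration. The helper `importances` is a hypothetical name of mine. The last part, summarizing the importances over several seeds, is not something the algorithm guarantees anything about; it is just one way to see how much they move:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, assumed here purely for illustration.
X, y = make_classification(n_samples=300, n_features=8,
                           n_informative=3, random_state=0)

def importances(seed):
    """Fit a forest with the given seed and return its feature importances."""
    rf = RandomForestClassifier(n_estimators=50, random_state=seed)
    return rf.fit(X, y).feature_importances_

# Same seed, same data: the importances are reproduced.
same_a = importances(0)
same_b = importances(0)
print("same seed identical:", np.allclose(same_a, same_b))

# Different seeds: collect the importances and look at their spread.
runs = np.array([importances(seed) for seed in range(10)])
mean_imp = runs.mean(axis=0)
std_imp = runs.std(axis=0)
for i, (m, s) in enumerate(zip(mean_imp, std_imp)):
    print(f"feature {i}: mean={m:.3f} std={s:.3f}")
```

A large spread relative to the mean for a feature suggests its ranking depends heavily on the seed.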