1
votes

I have a multi-output random forest regressor and I want to calculate its feature importances

An answer to another question suggested calculating the feature importances of the individual estimators. As you can see below, I didn't explicitly define any estimators, so I have no idea how many were created. That (for some reason) didn't throw an exception, but after running the code below I am told that the MultiOutputRegressor object does not have the attribute, although a RandomForestRegressor (RFR) does. If I try to access the original RFR in my model list, I get the exception 'tuple has no attribute feature importances'.

This code raises an error saying that the MultiOutputRegressor object has no attribute feature_importances_:

m4 = MultiOutputRegressor(RandomForestRegressor())
m5 = m4.estimator[0]
feature_importances = pd.DataFrame(m4.feature_importances_, index = X_train.columns, columns=['importance']).sort_values('importance')
print(feature_importances)
feature_importances.plot(kind = 'barh')

This code raises an error saying that a tuple object has no attribute feature_importances_:

m4 = models[5]
#m5 = m4.estimator[0]
feature_importances = pd.DataFrame(m4.feature_importances_, index = X_train.columns, columns=['importance']).sort_values('importance')
print(feature_importances)
feature_importances.plot(kind = 'barh')

I have only worked with classification problems before, and I want to be able to display the feature importances in a similar manner.

1 Answer

2
votes
In the line

m5 = m4.estimator[0]

replace .estimator[0] with .estimators_[0], and replace m4.feature_importances_ with m5.feature_importances_.

After creating the model with

m4 = MultiOutputRegressor(RandomForestRegressor())

you have to fit it using m4.fit(array1, array2). The estimators_ attribute does not exist until the model has been fitted; afterwards, m4.estimators_ holds one fitted RandomForestRegressor per output. You can then go one step further and get the feature importances for the first output with m4.estimators_[0].feature_importances_.
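
To make the steps concrete, here is a minimal sketch of the whole flow. The data is synthetic (generated with make_regression as a stand-in for the asker's X_train and targets), and the column names feature_0..., target_0... as well as n_estimators=20 are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

# Hypothetical data: 5 features, 2 target columns (stand-in for the real X_train / y_train).
X, y = make_regression(n_samples=200, n_features=5, n_targets=2, random_state=0)
X_train = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

m4 = MultiOutputRegressor(RandomForestRegressor(n_estimators=20, random_state=0))
m4.fit(X_train, y)  # estimators_ is only created by fit()

# m4.estimators_ is a list with one fitted RandomForestRegressor per target column,
# so collect each forest's feature_importances_ into one column per target.
importances = pd.DataFrame(
    {f"target_{i}": est.feature_importances_ for i, est in enumerate(m4.estimators_)},
    index=X_train.columns,
)
print(importances)

# Optional summary: average the importances across targets and sort ascending.
mean_importance = importances.mean(axis=1).sort_values()
print(mean_importance)
```

With matplotlib installed, mean_importance.plot(kind='barh') or importances.plot(kind='barh') should give a bar chart similar to the one in the question. Averaging across targets is just one possible way to summarize; looking at each target's column separately may be more informative when the outputs depend on different features.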