
According to the xgboost documentation (https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.training), xgboost returns feature importances:


Feature importances property


Feature importance is defined only for tree boosters. Feature importance is only defined when the decision tree model is chosen as base learner (booster=gbtree). It is not defined for other base learner types, such as linear learners (booster=gblinear).

Returns: feature_importances_

Return type: array of shape [n_features]

However, this does not seem to be the case, as the following toy example shows:

import seaborn as sns
import xgboost as xgb

mpg = sns.load_dataset('mpg')

toy = mpg[['mpg', 'cylinders', 'displacement', 'horsepower', 'weight']]

toy = toy.sample(frac=1)

N = toy.shape[0]

N1 = int(N/2)

toy_train = toy.iloc[:N1, :]
toy_test = toy.iloc[N1:, :]

toy_train_x = toy_train.iloc[:, 1:]

toy_train_y = toy_train.iloc[:, 0]  # the label is the first column, mpg

toy_test_x = toy_test.iloc[:, 1:]

toy_test_y = toy_test.iloc[:, 0]  # the label is the first column, mpg

max_depth = 6
eta = 0.3
subsample = 0.8
colsample_bytree = 0.7
alpha = 0.1

params = {'booster': 'gbtree', 'objective': 'reg:squarederror',  # 'reg:linear' is the deprecated name
          'max_depth': max_depth, 'eta': eta, 'subsample': subsample,
          'colsample_bytree': colsample_bytree, 'alpha': alpha}

dtrain_toy = xgb.DMatrix(data = toy_train_x , label = toy_train_y)
dtest_toy = xgb.DMatrix(data = toy_test_x, label = toy_test_y)
watchlist = [(dtest_toy, 'eval'), (dtrain_toy, 'train')]

xg_reg_toy = xgb.train(params=params, dtrain=dtrain_toy, num_boost_round=1000,
                       evals=watchlist, early_stopping_rounds=20)

xg_reg_toy.feature_importances_

AttributeError                            Traceback (most recent call last)
<ipython-input-378-248f7887e307> in <module>()
----> 1 xg_reg_toy.feature_importances_

AttributeError: 'Booster' object has no attribute 'feature_importances_'

Comment (Jeril): Have you tried the xgboost scikit-learn wrapper? It works for me.
Comment (user8270077): Yes, indeed, the Scikit-Learn API returns the feature importances.

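You are using the Learning API (xgb.train), but the documentation you quote describes the Scikit-Learn API. xgb.train returns a Booster object, and only the Scikit-Learn API estimators (such as XGBRegressor) have the feature_importances_ attribute.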


For anyone who, like me, is not using the Scikit-Learn API: the feature importances can still be obtained from the Booster object itself.


I was also looking for a more intuitive representation, which plot_importance provides:

from xgboost import plot_importance

# clf is the trained model (a Booster or a Scikit-Learn API estimator)
plot_importance(clf, max_num_features=10)

This generates a bar chart of the features in order of importance, limited to the top max_num_features if that optional argument is given.