
When I train an XGBoost model using AUC as the evaluation metric, I notice that the AUC score for the first several rounds is always 0.5. Basically this means the first several trees did not learn anything:

Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 20 rounds.
[0] train-auc:0.5   eval-auc:0.5
[1] train-auc:0.5   eval-auc:0.5
[2] train-auc:0.5   eval-auc:0.5
[3] train-auc:0.5   eval-auc:0.5
[4] train-auc:0.5   eval-auc:0.5
[5] train-auc:0.5   eval-auc:0.5
[6] train-auc:0.5   eval-auc:0.5
[7] train-auc:0.5   eval-auc:0.5
[8] train-auc:0.5   eval-auc:0.5
[9] train-auc:0.5   eval-auc:0.5
[10]    train-auc:0.5   eval-auc:0.5
[11]    train-auc:0.5   eval-auc:0.5
[12]    train-auc:0.5   eval-auc:0.5
[13]    train-auc:0.5   eval-auc:0.5
[14]    train-auc:0.537714  eval-auc:0.51776
[15]    train-auc:0.541722  eval-auc:0.521087
[16]    train-auc:0.555587  eval-auc:0.527019
[17]    train-auc:0.669665  eval-auc:0.632106
[18]    train-auc:0.6996    eval-auc:0.651677
[19]    train-auc:0.721472  eval-auc:0.680481
[20]    train-auc:0.722052  eval-auc:0.684549
[21]    train-auc:0.736386  eval-auc:0.690942

As you can see, the first 14 rounds (rounds 0 through 13) did not learn anything.

The parameters I used: param = {'max_depth':6, 'eta':0.3, 'silent':1, 'objective':'binary:logistic'}

I am using xgboost 0.8.

Is there any way to prevent this?

Thanks


1 Answer


An AUC of 0.5 during the first several rounds does not mean that XGBoost is not learning. Check whether your dataset is imbalanced. If it is, the predictions for all instances (both target=1 and target=0) first move from the default 0.5 toward the target mean, e.g. 0.17 (logloss improves, so learning is going on), and only then reach the region where improving logloss also improves AUC. If you want to help the algorithm reach this region faster, change the default value of the parameter base_score=0.5 to the target mean. https://xgboost.readthedocs.io/en/latest/parameter.html
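
A minimal sketch of this, assuming the standard xgboost Python API and a made-up label array for illustration (your real y_train would be used instead):

```python
# Hypothetical, imbalanced binary labels (~20% positives) standing in
# for a real y_train array.
labels = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

# Target mean: the fraction of positive instances. Starting the booster
# here (instead of the default 0.5) skips the early rounds spent just
# shifting all predictions toward the class prior.
base_score = sum(labels) / len(labels)

param = {
    'max_depth': 6,
    'eta': 0.3,
    'silent': 1,
    'objective': 'binary:logistic',
    'base_score': base_score,  # default is 0.5
}

# Then train as before, e.g.:
# import xgboost as xgb
# dtrain = xgb.DMatrix(X_train, label=y_train)
# bst = xgb.train(param, dtrain, num_boost_round=100,
#                 evals=[(deval, 'eval')], early_stopping_rounds=20)
```

With base_score set to the prior, the first trees can immediately work on separating the classes, which is what moves AUC away from 0.5.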