
I am trying to use 'aucpr' as the evaluation metric for early stopping with scikit-learn's RandomizedSearchCV and XGBoost, but I am unable to pass maximize=True through the early-stopping fit params. Instead, early stopping minimizes the AUCPR eval_metric.

I have already referred to this question: GridSearchCV - XGBoost - Early Stopping

But it seems early stopping only works for minimization objectives? The best iteration picked by early stopping is the one where AUCPR is lowest, which is the wrong direction of optimization.

    # Assumed imports and helpers; X_train, y_train, X_val, y_val are defined elsewhere.
    from sklearn.metrics import auc, make_scorer, precision_recall_curve
    from sklearn.model_selection import RandomizedSearchCV
    from xgboost import XGBClassifier

    def auc_precision_recall_curve(y_true, y_score):
        # Area under the precision-recall curve, used as the CV scoring metric.
        precision, recall, _ = precision_recall_curve(y_true, y_score)
        return auc(recall, precision)

    folds = 5        # number of CV folds (example value)
    param_comb = 20  # number of sampled parameter combinations (example value)

    xgb = XGBClassifier()

    params = {
        'min_child_weight': [0.1, 1, 5, 10, 50],
        'gamma': [0.5, 1, 1.5, 2, 5],
        'subsample': [0.6, 0.8, 1.0],
        'colsample_bytree': [0.6, 0.8, 1.0],
        'max_depth': [5, 10, 25, 50],
        'learning_rate': [0.0001, 0.001, 0.1, 1],
        'n_estimators': [50, 100, 250, 500],
        'reg_alpha': [0.0001, 0.001, 0.1, 1],
        'reg_lambda': [0.0001, 0.001, 0.1, 1],
    }

    fit_params = {
        "early_stopping_rounds": 5,
        "eval_metric": "aucpr",
        "eval_set": [[X_val, y_val]],
    }

    random_search = RandomizedSearchCV(
        xgb,
        cv=folds,
        param_distributions=params,
        n_iter=param_comb,
        scoring=make_scorer(auc_precision_recall_curve, needs_proba=True),
        n_jobs=10,
        verbose=10,
        random_state=1001,
    )

    random_search.fit(X_train, y_train, **fit_params)

1 Answer


It seems that maximizing AUCPR for early stopping does not work through the scikit-learn wrapper; the direction of the metric cannot be set from the fit params. See this XGBoost issue:

https://github.com/dmlc/xgboost/issues/3712