
I am trying to use 'aucpr' as the evaluation metric for early stopping with scikit-learn's RandomizedSearchCV and XGBoost, but I am unable to pass maximize=True through the early-stopping fit params. Instead, early stopping minimizes the AUCPR eval_metric.

I have already referred to this question: GridSearchCV - XGBoost - Early Stopping

But it seems early stopping only works for minimization objectives? The best iteration picked by early stopping is the one where AUCPR is lowest, which is the wrong direction of optimization.

    # Assumed imports and helpers; X_train, y_train, X_val, y_val are defined elsewhere.
    from sklearn.metrics import auc, make_scorer, precision_recall_curve
    from sklearn.model_selection import RandomizedSearchCV
    from xgboost import XGBClassifier

    def auc_precision_recall_curve(y_true, y_score):
        # Area under the precision-recall curve, used as the CV scoring metric.
        precision, recall, _ = precision_recall_curve(y_true, y_score)
        return auc(recall, precision)

    folds = 5        # number of CV folds (example value)
    param_comb = 20  # number of sampled parameter combinations (example value)

    xgb = XGBClassifier()

    params = {
        'min_child_weight': [0.1, 1, 5, 10, 50],
        'gamma': [0.5, 1, 1.5, 2, 5],
        'subsample': [0.6, 0.8, 1.0],
        'colsample_bytree': [0.6, 0.8, 1.0],
        'max_depth': [5, 10, 25, 50],
        'learning_rate': [0.0001, 0.001, 0.1, 1],
        'n_estimators': [50, 100, 250, 500],
        'reg_alpha': [0.0001, 0.001, 0.1, 1],
        'reg_lambda': [0.0001, 0.001, 0.1, 1],
    }

    fit_params = {
        "early_stopping_rounds": 5,
        "eval_metric": "aucpr",
        "eval_set": [[X_val, y_val]],
    }

    random_search = RandomizedSearchCV(
        xgb,
        cv=folds,
        param_distributions=params,
        n_iter=param_comb,
        scoring=make_scorer(auc_precision_recall_curve, needs_proba=True),
        n_jobs=10,
        verbose=10,
        random_state=1001,
    )

    random_search.fit(X_train, y_train, **fit_params)

1 Answer


It seems that maximizing AUCPR for early stopping does not work through the scikit-learn wrapper; the direction of the metric cannot be set from the fit params. See this XGBoost issue:

https://github.com/dmlc/xgboost/issues/3712