
I am currently playing around with a toy example regarding hyperparameter optimization in xgboost. In the following example I go through the following steps:

  1. Load the iris dataset from sklearn and split it into train and test sets.
  2. Declare a parameter grid I'd like to explore.
  3. Given the multi-class classification nature of the problem, I'd like to evaluate my model based on f1 score. To do that, I declare an xgb_f1 function (since f1 score is not among xgboost's built-in evaluation metrics) so that the metric the algorithm targets matches the one used in cross-validation.
  4. Instantiate and fit a RandomizedSearchCV using f1_macro as my scoring function (the same metric the classifier uses).

Now, when fitting the search, the following message appears in the training output:

Multiple eval metrics have been passed: 'validation_0-f1' will be used for early stopping.

Everything seems to train smoothly, but why isn't merror overridden by eval_metric, and why does it still get computed on my eval set?

Also, as far as I can tell from the xgboost documentation, the algorithm minimizes the target metric by default. Should I change this behavior, given that f1 score (which should be maximized) will be used?
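
For reference, one workaround I have been considering (not sure it is the intended approach) is to return an error-style value, 1 - f1, from the custom metric, so that the default lower-is-better early stopping still picks the best round. The xgb_f1_error name below is just mine; it mirrors xgb_f1 from the full example:

import numpy as np
from sklearn.metrics import f1_score

def xgb_f1_error(y, t, threshold=0.5):
    # Same logic as xgb_f1, but reports 1 - f1 so that a lower value is better,
    # matching xgboost's default minimizing behaviour for early stopping.
    t = t.get_label()
    y_bin = (y > threshold).astype(int)   # threshold the per-class outputs
    y_bin = np.argmax(y_bin, axis=1)      # one-hot style rows -> class index
    return "f1_err", 1.0 - f1_score(t, y_bin, average="macro")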

Full working example

import xgboost as xgb
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.metrics import f1_score
from sklearn.datasets import load_iris
import numpy as np

data = load_iris()
x = data.data
y = data.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33)

param_grid = {
    "n_estimators":     [100, 200, 300, 500, 600, 800],
    "max_depth":        [2, 4, 8, 16, 32, 70, 100, 150],
    "min_child_weight": [1],
    "subsample":        [1]
}


def xgb_f1(y, t, threshold=0.5):
    # Custom eval metric: y holds the model's predictions, t is the DMatrix with the true labels.
    t = t.get_label()
    y_bin = (y > threshold).astype(int)   # threshold the per-class outputs
    y_bin = np.argmax(y_bin, axis=1)      # pick the predicted class index
    return "f1", f1_score(t, y_bin, average="macro")


fit_params = {
    "early_stopping_rounds": 42,
    "eval_set": [[x_test, y_test]],   # evaluation set used for early stopping
    "eval_metric": xgb_f1             # custom f1 metric defined above
}

clf = xgb.XGBClassifier(objective="multi:softmax")
grid = RandomizedSearchCV(clf, param_grid, n_jobs=-1, cv=2, verbose=1, scoring="f1_macro")
grid.fit(x_train, y_train, **fit_params, verbose=True)
print(f"Best f1-score: {grid.best_score_}")
print(f"best params: {grid.best_params_}")


1 Answer


Have you tried using "disable_default_eval_metric": 1 in your parameters?
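
In case it helps, a minimal sketch of where that could go with the sklearn wrapper (assuming extra keyword arguments are forwarded to the booster as parameters, which may depend on your xgboost version):

import xgboost as xgb

clf = xgb.XGBClassifier(
    objective="multi:softmax",
    disable_default_eval_metric=1,  # suppress merror so only the custom f1 metric is reported
)

With that set, only the metric passed via eval_metric should show up on the eval set.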