
After training my model with XGBoost, I tried to test it, but the predictions are floating-point numbers, which causes an error when I try to compute performance measures. This is the code:

import xgboost as xgb
import sklearn.metrics as mt

xg_reg = xgb.XGBRegressor(objective='reg:linear', colsample_bytree=0.3, learning_rate=0.1,
                          max_depth=5, alpha=10, n_estimators=10)
xg_reg.fit(X_train, Y_train)
y_pred = xg_reg.predict(X_test)
mt.f1_score(Y_test, y_pred)

And this is the error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

This never happened when I used other boosting models such as AdaBoost or CatBoost. Should I set a threshold and assign +1 to predictions above it and -1 to those below? Any kind of advice is appreciated.

Answer not helpful? – desertnaut
@desertnaut, yeah, quite helpful! I should replace the regression model with a classifier. – amiref

1 Answer


Assuming that you are in a binary classification setting, as you clearly imply, the issue is that you should not use XGBRegressor, which is for regression problems, not classification ones; from the docs (emphasis added):

class xgboost.XGBRegressor

Implementation of the scikit-learn API for XGBoost regression

You should use XGBClassifier instead.

For more details, see my own answer in Accuracy Score ValueError: Can't Handle mix of binary and continuous target (caution: practically all the other answers there, including the accepted and highly upvoted one, are essentially wrong); it concerns a practically identical issue with scikit-learn, but the same arguments hold for your case as well.