I have a binary classification task and I am using the xgboost package to solve it; essentially, I just use boosted trees. Since I am being evaluated on the Brier score, I thought I would optimize the Brier loss directly (defined as the Brier score applied on top of the logistic transformation of the raw scores), which led me to define the gradient and the Hessian of the Brier loss like so:
import numpy as np
import xgboost as xgb

def brier(preds, dtrain):
    labels = dtrain.get_label()
    # preds are raw scores, so map them to probabilities first
    preds = 1.0 / (1.0 + np.exp(-preds))
    grad = 2 * (preds - labels) * preds * (1 - preds)
    hess = 2 * (2 * (labels + 1) * preds - labels - 3 * preds * preds) * preds * (1 - preds)
    return grad, hess
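For reference, these expressions come from differentiating the Brier loss with respect to the raw score $x$, writing $p = \sigma(x)$ for the sigmoid and $y \in \{0, 1\}$ for the label:

$$L(x) = (p - y)^2, \qquad \frac{dp}{dx} = p(1 - p),$$
$$\frac{\partial L}{\partial x} = 2(p - y)\,p(1 - p),$$
$$\frac{\partial^2 L}{\partial x^2} = 2\bigl(2(1 + y)p - y - 3p^2\bigr)\,p(1 - p).$$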
I also defined an evaluation function that reports the Brier score itself:

def evalerror(preds, dtrain):
    # apply the same sigmoid so the score is computed on probabilities
    preds = 1.0 / (1.0 + np.exp(-preds))
    labels = dtrain.get_label()
    errors = (labels - preds) ** 2
    return 'brier-error', float(np.sum(errors)) / len(labels)
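To convince myself that the gradient and Hessian are right, I ran a quick finite-difference check (a minimal standalone sketch with made-up scores x and labels y, not my real data):

# finite-difference sanity check of the analytic gradient and Hessian
x = np.linspace(-3.0, 3.0, 13)   # made-up raw scores
y = (x > 0).astype(float)        # made-up 0/1 labels
eps = 1e-5

def loss(x, y):
    p = 1.0 / (1.0 + np.exp(-x))
    return (p - y) ** 2

p = 1.0 / (1.0 + np.exp(-x))
grad = 2 * (p - y) * p * (1 - p)
hess = 2 * (2 * (y + 1) * p - y - 3 * p * p) * p * (1 - p)

# central differences in the raw score
num_grad = (loss(x + eps, y) - loss(x - eps, y)) / (2 * eps)
num_hess = (loss(x + eps, y) - 2 * loss(x, y) + loss(x - eps, y)) / eps ** 2

assert np.allclose(grad, num_grad, atol=1e-6)
assert np.allclose(hess, num_hess, atol=1e-3)

Both assertions pass, so at least the calculus matches the code.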
The training setup is then:

param = {'eta': 0.01,
         'max_depth': 6,                    # the maximum depth of each tree
         # 'objective': 'binary:logistic',  # disabled in favour of the custom objective
         'booster': 'gbtree',
         'eval_metric': ['rmse', 'auc']}

bst = xgb.train(param, dtrain,
                num_boost_round=999,
                early_stopping_rounds=10,
                obj=brier,
                feval=evalerror,
                evals=[(dtrain, 'train'), (dtest, 'test')])
The only problem is that, by doing so, I get negative values in my predictions on the test set, which suggests that the output of the xgboost model is not the logistic probability I expected. Does anyone know what I am missing here, or whether there is a better way to optimize the Brier score?
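For reference, here is the manual mapping I have in mind as a workaround (a minimal sketch, assuming that with a custom objective predict() returns the untransformed margin):

# with a custom objective, xgboost does not know the link function,
# so predict() seems to return the raw (pre-sigmoid) margin
raw_preds = bst.predict(dtest)              # can be negative
probs = 1.0 / (1.0 + np.exp(-raw_preds))    # back in [0, 1]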
Any help would be really appreciated!!
Thanks,