I am trying to implement XGBoost on a classification dataset with imbalanced classes (1% ones and 99% zeroes). I am using binary:logistic as the objective function for classification.
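For context, here is a minimal sketch of my current setup. The data below is synthetic and the names (X, y, dtrain) are placeholders for my real feature matrix and 0/1 labels:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))             # stand-in features
y = (rng.random(10_000) < 0.01).astype(int)   # ~1% positives, ~99% negatives

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "binary:logistic", "eval_metric": "logloss"}
model = xgb.train(params, dtrain, num_boost_round=100)
```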
To my understanding of XGBoost, as boosting builds trees the objective function is optimized iteratively, so the best performance is achieved at the end, when all the trees are combined.
Due to the class imbalance in my data, I am facing the accuracy paradox: the final model achieves great accuracy but poor precision and recall.
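This is how I am measuring the problem (continuing from the sketch above; in practice I evaluate on a held-out set rather than the training data):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

pred_prob = model.predict(dtrain)             # binary:logistic -> probabilities
pred_label = (pred_prob > 0.5).astype(int)    # default 0.5 threshold

print("accuracy :", accuracy_score(y, pred_label))   # very high, driven by the 99% majority class
print("precision:", precision_score(y, pred_label, zero_division=0))
print("recall   :", recall_score(y, pred_label))
```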
I would like a custom objective function that optimizes the model so that the final XGBoost model has the best F-score. Or is there an existing objective function I can use that results in the best F-score?
where F-score = (2 * Precision * Recall) / (Precision + Recall).
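For reference, the closest I have gotten is the sketch below: using F-score only as a custom evaluation metric via the feval hook of xgb.train (newer XGBoost versions call this custom_metric), with early stopping so training halts at the boosting round with the best validation F1. This does not make F1 the training objective; the gradients still come from binary:logistic, which is why I am asking about a custom objective. The validation split here is a hypothetical stand-in:

```python
from sklearn.metrics import f1_score

def f1_eval(preds, dmat):
    # With binary:logistic, preds passed to feval are probabilities.
    labels = dmat.get_label()
    return "f1", f1_score(labels, (preds > 0.5).astype(int))

# Hypothetical held-out split with the same ~1% positive rate.
X_valid = rng.normal(size=(2_000, 20))
y_valid = (rng.random(2_000) < 0.01).astype(int)
dvalid = xgb.DMatrix(X_valid, label=y_valid)

booster = xgb.train(
    params,
    dtrain,
    num_boost_round=500,
    evals=[(dvalid, "valid")],
    feval=f1_eval,
    maximize=True,              # higher F1 is better
    early_stopping_rounds=25,   # stop at the round with the best validation F1
)
```

This picks the model snapshot with the best F-score, but each round is still optimized for log loss, so I am hoping there is a way to push the F-score into the optimization itself.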