I am using Python to train an XGBoost regressor on a dataset with 25 feature columns, using scikit-learn's GridSearchCV for parameter tuning. GridSearchCV lets you choose your scorer with the scoring parameter, and r2 is a valid option.
grid = GridSearchCV(mdl, param_grid=params, verbose=1, cv=kfold,
                    n_jobs=-1, error_score='raise', scoring='r2')
However, when I want to use r2 as the eval_metric in the grid.fit() call, there is no obvious way to do so.
grid.fit(X_train, y_train, eval_set=[(X_test, y_test)],
         eval_metric='rmse', early_stopping_rounds=150)
I have tried using sklearn's built-in r2_score function, but there are a few issues. The first is that an r2 score is calculated from the y_test set against the y_pred set, and to produce a y_pred set, the model must already be fitted. So you can see I'm running into a chicken-and-egg problem.
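To make the circular dependency concrete, here is a minimal sketch of standalone r2_score usage (reusing the mdl, X_test, and y_test names from the snippets above); it only works once the model has been fitted:

from sklearn.metrics import r2_score

# r2_score needs the true targets and the predictions up front,
# and producing predictions requires an already-fitted model
y_pred = mdl.predict(X_test)     # fails if mdl has not been fitted yet
print(r2_score(y_test, y_pred))  # 1.0 would be a perfect fit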
I have tried a few things to get around this. The first was to train the model and make predictions inside the eval_metric argument, like below:
grid.fit(X_train, y_train, eval_set=[(X_test, y_test)],
         eval_metric=r2_score(y_test, mdl.predict(X_test)), early_stopping_rounds=150)
But I am given the following error:
xgboost.core.XGBoostError: need to call fit beforehand
Which makes sense.
Is there some way that I can grab the current parameters that GridSearchCV is using, create and store predictions, and then use r2_score as the eval_metric?
My thoughts are this: the r2 score is a standard evaluation metric, bounded above by 1 (a perfect fit) and typically falling between 0 and 1. If there were a standardized way to optimize it directly, it would have very far reach across almost all of machine learning.
1) … y_test vs y_pred, so I do not see why that is a showstopper. 2) The signature of a callable is specific to xgboost; see the eval_metric documentation here: xgboost.readthedocs.io/en/latest/python/…. 3) Why do you want to have eval_metric in the first place? It is not used in optimisation, but only for monitoring performance between iterations and for early stopping. – Mischa Lisovyi
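Following up on that comment, here is a minimal sketch of what such a callable could look like. It assumes the older XGBoost scikit-learn API (pre-1.6), where fit() accepts a callable eval_metric with the signature func(y_predicted, dtrain) returning a (name, value) pair, and where a custom metric is always minimized for early stopping, which is why the sketch returns 1 - r2 rather than r2 itself (the r2_eval name is just for illustration):

from sklearn.metrics import r2_score

def r2_eval(y_pred, dtrain):
    # dtrain is an xgboost DMatrix; get_label() returns the true targets
    y_true = dtrain.get_label()
    # XGBoost always minimizes a custom eval metric, and r2 is better
    # when larger, so report 1 - r2: driving it toward 0 maximizes r2
    return 'one_minus_r2', 1.0 - r2_score(y_true, y_pred)

grid.fit(X_train, y_train, eval_set=[(X_test, y_test)],
         eval_metric=r2_eval, early_stopping_rounds=150)

Note that, as the comment says, this only affects per-iteration monitoring and early stopping; GridSearchCV still ranks parameter combinations with its own scoring='r2' scorer.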