I am trying to compute roc_auc_score for a fitted model on a validation set.

I am seeing some conflicting information on function inputs.

Documentation says:

"y_score array-like of shape (n_samples,) or (n_samples, n_classes) Target scores. In the binary and multilabel cases, these can be either probability estimates or non-thresholded decision values (as returned by decision_function on some classifiers). In the multiclass case, these must be probability estimates which sum to 1. The binary case expects a shape (n_samples,), and the scores must be the scores of the class with the greater label. The multiclass and multilabel cases expect a shape (n_samples, n_classes). In the multiclass case, the order of the class scores must correspond to the order of labels, if provided, or else to the numerical or lexicographical order of the labels in y_true."

I'm not sure exactly what this calls for: (1) predicted probabilities compared against the actual y values in the test set, or (2) class predictions compared against the actual y values in the test set.

I've been searching and, in the binary classification case (my interest), some people use predicted probabilities while others use actual predictions (0 or 1). In other words:

Fit model:

model.fit(X_train, y_train)

Use either:

y_preds = model.predict(X_test)

or:

y_probas = model.predict_proba(X_test)

I find that:

roc_auc_score(y_test, y_preds)

and:

roc_auc_score(y_test, y_probas[:, 1])  # probabilities for the 1 class

yield very different results.

Which one is correct?

I also find that to actually plot the ROC curve I need to use probabilities.

Any guidance appreciated.

1 Answer

model.predict(...) will give you the predicted label for each observation. That is, it will return an array full of ones and zeros.

model.predict_proba(...)[:, 1] will give you the predicted probability that each observation belongs to class 1. That is, it will return an array full of numbers between zero and one, inclusive.

A ROC curve is calculated by taking each distinct predicted score, using it as a threshold, and calculating the resulting true positive and false positive rates. Hence, if you pass model.predict(...) to metrics.roc_auc_score(), the only scores available are zero and one, so the curve has a single operating point between (0, 0) and (1, 1). The resulting number describes one fixed decision threshold rather than how well the model ranks observations, so it is not the AUC of your model's predicted probabilities.
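To make the threshold argument concrete, here is a minimal pure-Python sketch of that procedure. The helper names (roc_points, trapezoid_auc) and the toy data are hypothetical and exist only for illustration; this is not how scikit-learn implements it internally.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by thresholding at each distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def trapezoid_auc(points):
    """Area under the piecewise-linear curve through (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: true labels and the model's predicted P(class 1).
y_true = [0, 0, 1, 1, 0, 1]
y_proba = [0.1, 0.4, 0.35, 0.8, 0.6, 0.7]
# Hard labels at a 0.5 cutoff, standing in for what predict() would return.
y_pred = [1 if p >= 0.5 else 0 for p in y_proba]

auc_proba = trapezoid_auc(roc_points(y_true, y_proba))  # 7/9, about 0.778
auc_pred = trapezoid_auc(roc_points(y_true, y_pred))    # 2/3, about 0.667

print(len(roc_points(y_true, y_proba)), auc_proba)  # 7 points on the curve
print(len(roc_points(y_true, y_pred)), auc_pred)    # only 3 points on the curve
```

With hard labels the curve collapses to (0, 0), one interior point, and (1, 1), so the area reduces to the average of the true positive rate and the true negative rate at that single cutoff.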

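To get the AUC of your model, pass the predicted probabilities for the positive class (or, as the documentation quoted in the question notes, non-thresholded decision values) to roc_auc_score(...):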

from sklearn.metrics import roc_auc_score
roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
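For a side-by-side comparison, the following is a rough end-to-end sketch. The synthetic dataset from make_classification and the choice of LogisticRegression are assumptions made for illustration only; substitute your own data and estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic binary problem (an assumption; any binary dataset would do).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

y_preds = model.predict(X_test)                 # hard 0/1 labels
y_probas = model.predict_proba(X_test)[:, 1]    # P(class 1) for each row
y_scores = model.decision_function(X_test)      # non-thresholded decision values

auc_from_probas = roc_auc_score(y_test, y_probas)  # the model's ROC AUC
auc_from_scores = roc_auc_score(y_test, y_scores)  # ranking-based, so expected to match closely
auc_from_labels = roc_auc_score(y_test, y_preds)   # reflects a single threshold only

# With hard binary labels the "AUC" reduces to balanced accuracy.
bal_acc = balanced_accuracy_score(y_test, y_preds)

# roc_curve also takes scores; these arrays are what you would plot.
fpr, tpr, thresholds = roc_curve(y_test, y_probas)

print(auc_from_probas, auc_from_scores, auc_from_labels, bal_acc)
```

This also explains why plotting needs probabilities: roc_curve can only trace a full curve when it has many distinct score values to threshold on.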