I am trying to compute roc_auc_score for a fitted model on a validation set.
I am seeing some conflicting information on function inputs.
Documentation says:
"y_score array-like of shape (n_samples,) or (n_samples, n_classes) Target scores. In the binary and multilabel cases, these can be either probability estimates or non-thresholded decision values (as returned by decision_function on some classifiers). In the multiclass case, these must be probability estimates which sum to 1. The binary case expects a shape (n_samples,), and the scores must be the scores of the class with the greater label. The multiclass and multilabel cases expect a shape (n_samples, n_classes). In the multiclass case, the order of the class scores must correspond to the order of labels, if provided, or else to the numerical or lexicographical order of the labels in y_true."
I'm not sure exactly what this calls for: 1) predicted probabilities evaluated against the actual y values in the test set, or 2) class predictions evaluated against the actual y values in the test set.
I've been searching and, in the binary classification case (my interest), some people use predicted probabilities while others use actual predictions (0 or 1). In other words:
Fit model:
model.fit(X_train, y_train)
Use either:
y_preds = model.predict(X_test)
or:
y_probas = model.predict_proba(X_test)
I find that:
roc_auc_score(y_test, y_preds)
and:
roc_auc_score(y_test, y_probas[:, 1]) # probabilities for the 1 class
yield very different results.
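For concreteness, here is a minimal, self-contained sketch of what I'm doing; the make_classification data and LogisticRegression estimator are just placeholders, not my actual pipeline:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# placeholder binary-classification data
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

y_preds = model.predict(X_test)          # hard 0/1 class labels
y_probas = model.predict_proba(X_test)   # shape (n_samples, 2)

print(roc_auc_score(y_test, y_preds))         # AUC from class labels
print(roc_auc_score(y_test, y_probas[:, 1]))  # AUC from positive-class probabilities

The two print statements give noticeably different numbers.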
Which one is correct?
I also find that to actually plot the ROC curve I need to use the probabilities.
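Roughly like this, continuing from the snippet above (assuming matplotlib is available; fpr and tpr come from roc_curve):

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# roc_curve needs continuous scores, not 0/1 labels
fpr, tpr, thresholds = roc_curve(y_test, y_probas[:, 1])
plt.plot(fpr, tpr, label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()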
Any guidance appreciated.