
I have created a binary classifier in TensorFlow that outputs a generator object containing predictions. I extract the predictions (e.g. [0.98, 0.02]) from the object into a list and later convert it into a NumPy array. I have the corresponding array of labels for these predictions. Using these two arrays, I believe I should be able to plot a ROC curve via:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

fpr, tpr, thr = roc_curve(labels, predictions[:,1])
plt.plot(fpr, tpr)
plt.show()
print(fpr)
print(tpr)
print(thr)

Here predictions[:,1] gives the positive prediction score. However, running this code produces only a flat line, and only three values each for fpr, tpr and thr (screenshot: flat-line ROC plot and limited function outputs).

The only theory I have as to why this is happening is that my classifier is too sure of its predictions. Many, if not all, of the positive prediction scores are either 1.0 or incredibly close to zero:

[[9.9999976e-01 2.8635742e-07]
 [3.3693312e-11 1.0000000e+00]
 [1.0000000e+00 9.8642090e-09]
 ...
 [1.0106111e-15 1.0000000e+00]
 [1.0000000e+00 1.0030269e-09]
 [8.6156778e-15 1.0000000e+00]]

According to a few sources, including this Stack Overflow thread and this Stack Overflow thread, the very polar values of my predictions could be creating an issue for roc_curve().

Is my intuition correct? If so, is there anything I can do about it to plot my ROC curve?

I've tried to include all the information I think is relevant to this issue, but if you would like any more information about my program, please ask.

1 Answer


A ROC curve is generated by sweeping the decision threshold over your predictions and computing the sensitivity and specificity at each threshold. As the threshold increases, sensitivity generally decreases while specificity increases, and the resulting curve gives a picture of the overall quality of your predicted probabilities. In your case, since everything is either 0 or 1 (or very close to it), there are no meaningful intermediate thresholds to use. That is why roc_curve returns only three values for thr.
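To make the threshold sweep concrete, here is a minimal pure-Python sketch. The helper name roc_points and the six scores are made up for illustration (they only resemble the polar outputs in the question), and this is not how sklearn implements roc_curve internally.

```python
def roc_points(labels, scores):
    """Return (threshold, fpr, tpr) for every distinct score, highest first."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        # Predict "positive" whenever the score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points


# Hypothetical, perfectly separated scores (not the asker's real data).
labels = [0, 1, 0, 1, 0, 1]
scores = [2.9e-07, 1.0, 9.9e-09, 1.0, 1.0e-09, 0.9999999]

for thr, fpr, tpr in roc_points(labels, scores):
    print(f"thr={thr:.3g}  fpr={fpr:.2f}  tpr={tpr:.2f}")
```

With scores this polarized, every point of the sweep has either fpr equal to 0 or tpr equal to 1, so the curve only ever runs along the left and top edges of the plot.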

You can try arbitrarily pulling the values closer to 0.5, or implement your own ROC curve calculation with more tolerance for small differences.
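Before writing a custom calculation, it may be worth checking how much of the curve sklearn is simply leaving out. The sketch below assumes the drop_intermediate parameter of roc_curve, which by default discards thresholds that do not change the shape of the curve; the labels and scores are invented, perfectly separated values, not the data from the question.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Made-up, perfectly separated data; not the asker's actual predictions.
labels = np.array([0, 0, 0, 1, 1, 1])
positive_scores = np.array([1e-9, 1e-7, 1e-5, 0.99999, 0.999999, 1.0])

# Default call: sklearn may drop points that do not change the curve's shape.
fpr, tpr, thr = roc_curve(labels, positive_scores)

# Keep every distinct threshold so that nothing is hidden.
fpr_all, tpr_all, thr_all = roc_curve(
    labels, positive_scores, drop_intermediate=False
)

print("points kept by default:", len(thr))
print("points with drop_intermediate=False:", len(thr_all))
print("area under the curve:", auc(fpr_all, tpr_all))
```

If the area under the curve comes out at 1.0, a line that hugs the top of the plot with only a few points is what a perfectly separating classifier produces, rather than a sign that roc_curve failed.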

On the other hand, you might want to review your network, because such extreme output values often indicate a problem there: the labels may have leaked into the network somehow, causing it to produce perfect results.