How to find the ROC curve and AUC score of this CNN model (keras)

Question

My CNN code in keras is as follows:

from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout

classifier = Sequential()
#1st Conv layer
classifier.add(Convolution2D(64, (9, 9), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(4,4)))
#2nd Conv layer
classifier.add(Convolution2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))

#Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.1))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.2))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'softmax'))

classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

#Fitting dataset

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'categorical')

classifier.fit_generator(
        training_set,
        steps_per_epoch=(1341+3875)/32,
        epochs=15,
        validation_data=test_set,
        validation_steps=(234+390)/32)

Wherever I see roc_curve from sklearn.metrics used, the examples work with x_train, y_train, x_test and y_test, which I know can be pandas DataFrames, but that is not my situation: my data comes from image generators. How do I plot the ROC curve and get the AUC score for a CNN trained like this one?

umbreon29 · Accepted Answer · 2020-04-30T17:36:24

Actually, if you look at the docs of sklearn.metrics.roc_curve (and of almost every sklearn metric), it doesn't take the inputs of your model (the images) as arguments; it takes only the true labels and the predicted scores. So first run inference on the test set, which in keras (I'm guessing here) is something like

preds = classifier.predict(batch)

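Then call roc_curve with 1-D arrays: the true class indices (not one-hot rows) and the predicted score of the positive class. It returns three arrays, not two:

from sklearn.metrics import roc_curve
fpr, tpr, thresholds = roc_curve(true_labels, preds[:, 1])  # column 1 = score for class 1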

You may have to convert the types first if they are tensors, although Keras predict itself returns a NumPy array.
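To make the call above concrete, here is a minimal sketch assuming one-hot labels (as produced by class_mode='categorical') and a two-column softmax output. The helper name roc_from_softmax and the toy arrays are made up for illustration; in practice the arrays would come from the test generator and classifier.predict.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

def roc_from_softmax(true_onehot, probs, positive_class=1):
    """Return (fpr, tpr, thresholds, auc) for one class of a softmax output."""
    true_onehot = np.asarray(true_onehot)
    probs = np.asarray(probs)
    # One-hot rows -> 1-D 0/1 vector marking the positive class
    y_true = (np.argmax(true_onehot, axis=1) == positive_class).astype(int)
    # The predicted probability of the positive class is the score
    y_score = probs[:, positive_class]
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return fpr, tpr, thresholds, auc(fpr, tpr)

# Toy data standing in for real labels and model predictions
toy_labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
toy_probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]])
fpr, tpr, thresholds, roc_auc = roc_from_softmax(toy_labels, toy_probs)
print(roc_auc)
```

To draw the curve, the returned fpr and tpr can be passed straight to matplotlib, for example plt.plot(fpr, tpr), with the AUC value in the legend.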

EDIT: I've checked the keras documentation on flow_from_directory. It yields an iterator over (x, y) = (images, labels), so if you want to do some kind of post-training analysis you should collect the labels using something like this:

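labels = []
# The iterator loops forever, so stop after one pass over the test set
for i in range(len(test_set)):
    _, y = test_set[i]
    labels.extend(list(y))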
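One thing to watch: flow_from_directory shuffles by default, so labels collected in one pass may not line up with predictions made in another. A possible way around this, sketched below, is to gather labels and predictions batch by batch in the same loop. The function name collect_labels_and_preds is hypothetical, and the sketch assumes the generator supports len() and integer indexing and that the model has a predict method, as Keras generators and models do.

```python
import numpy as np

def collect_labels_and_preds(model, generator):
    """Run one pass over a Keras-style batch generator.

    Assumes generator[i] returns (images, labels) and that
    model.predict(images) returns one row of scores per image.
    """
    labels, preds = [], []
    # range(len(...)) gives exactly one epoch; the iterator itself never stops
    for i in range(len(generator)):
        x, y = generator[i]
        labels.append(np.asarray(y))
        preds.append(np.asarray(model.predict(x)))
    # Stack the per-batch arrays into one array each, in matching order
    return np.concatenate(labels), np.concatenate(preds)
```

With the objects from the question this would be called as labels, preds = collect_labels_and_preds(classifier, test_set). Alternatively, creating the test generator with shuffle=False keeps the order fixed between passes.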

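And if you only have two classes, change the class_mode to binary. The labels then come out as a 1-D array of 0s and 1s, so the model's last layer should become a single sigmoid unit trained with binary_crossentropy.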
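For the binary setup, a short sketch of the AUC computation follows. It assumes the model has been changed to end in a single sigmoid unit (which is not the model from the question), so predictions have shape (n, 1) and need flattening. The helper name auc_from_sigmoid is invented for this example.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_from_sigmoid(labels, preds):
    """AUC for 0/1 labels and sigmoid outputs of shape (n,) or (n, 1)."""
    y_true = np.asarray(labels).ravel()
    y_score = np.asarray(preds).ravel()  # flatten (n, 1) to (n,)
    return roc_auc_score(y_true, y_score)

# Toy values standing in for generator labels and model predictions
example_auc = auc_from_sigmoid([0, 0, 1, 1], [[0.1], [0.4], [0.35], [0.8]])
print(example_auc)
```

roc_auc_score gives the AUC directly; roc_curve is only needed when the curve itself is to be plotted.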
