0
votes

My CNN code in keras is as follows:

from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout

classifier = Sequential()
#1st Conv layer
classifier.add(Convolution2D(64, (9, 9), input_shape=(64, 64, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(4,4)))
#2nd Conv layer
classifier.add(Convolution2D(32, (3, 3), activation='relu'))
classifier.add(MaxPooling2D(pool_size=(2,2)))

#Flattening
classifier.add(Flatten())

# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.1))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.2))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'softmax'))

classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

#Fitting dataset

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'categorical')

classifier.fit_generator(
        training_set,
        steps_per_epoch=(1341+3875)/32,
        epochs=15,
        validation_data=test_set,
        validation_steps=(234+390)/32)

Every example of roc_curve from sklearn.metrics that I have seen works from variables like x_train, y_train, x_test, y_test, which I know can be pandas DataFrames, but I don't have those here because my data comes from directory generators. How do I plot the ROC curve and get the AUC score for a CNN trained like this?


2 Answers

1
votes

Actually, if you look at the docs for sklearn.metrics.roc_curve (and almost every other sklearn metric), it doesn't take your model's inputs (the images) as arguments; it only takes the true labels and the predicted scores. So after you run inference on the test set, which in Keras (I'm guessing here) looks something like

preds = classifier.predict(batch)

You call roc_curve as

# roc_curve returns three arrays: fpr, tpr and the thresholds
fpr, tpr, thresholds = roc_curve(true_labels, preds)

You will probably have to convert the type first, though, because they may be tensors rather than plain arrays.
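As a rough illustration of the call above, here is a minimal sketch on made-up numbers rather than real model output. It assumes the predictions come from a two-unit softmax layer (shape (n, 2)) and the labels are one-hot, as in the question; the arrays and the choice of column 1 as the positive class are assumptions for the example.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Made-up stand-ins for one-hot labels and softmax outputs (assumption)
true_labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
preds = np.array([[0.90, 0.10],
                  [0.60, 0.40],
                  [0.65, 0.35],
                  [0.20, 0.80]])

# roc_curve wants 1-D inputs: class indices and the positive-class score
y_true = true_labels.argmax(axis=1)
y_score = preds[:, 1]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print(roc_auc)
```

Which column counts as the positive class depends on the class indices the generator assigned, so it is worth checking test_set.class_indices before relying on column 1.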

EDIT: I've checked the Keras documentation for flow_from_directory: it yields an iterator over (x, y) = (images, labels), so if you want to do any post-training analysis you should collect the labels with something like this:

labels = []
# The Keras iterator loops forever, so index it for exactly one pass
for i in range(len(test_set)):
    _, y = test_set[i]
    labels.extend(list(y))

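And if you only have two classes, change the class_mode to binary. The last layer then needs to be a single sigmoid unit trained with binary_crossentropy, and the predictions become one score per image, which is the form roc_curve expects.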
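One way to keep labels and predictions aligned is to collect both in the same pass over the batches. The sketch below is only an illustration: collect_labels_and_scores, FakeBatches and fake_predict are hypothetical names, and the fake data stands in for the real generator and model so the code runs without Keras.

```python
import numpy as np

def collect_labels_and_scores(batches, predict_fn):
    """Walk an indexable batch sequence once, pairing labels with scores.

    `batches` must support len() and batches[i] -> (x, y);
    `predict_fn` maps a batch of inputs to an array of scores.
    """
    labels, scores = [], []
    for i in range(len(batches)):  # exactly one pass over the data
        x, y = batches[i]
        labels.append(np.asarray(y))
        scores.append(np.asarray(predict_fn(x)))
    return np.concatenate(labels), np.concatenate(scores)

class FakeBatches:
    """Hypothetical stand-in for the directory iterator."""
    def __init__(self, x, y, batch_size):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, i):
        s = slice(i * self.batch_size, (i + 1) * self.batch_size)
        return self.x[s], self.y[s]

def fake_predict(batch):
    # Hypothetical model: two softmax-like columns per sample
    p = batch / 4.0
    return np.hstack([1.0 - p, p])

x = np.arange(5, dtype=np.float32).reshape(5, 1)
y = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [0, 1]], dtype=np.float32)
labels, scores = collect_labels_and_scores(FakeBatches(x, y, 2), fake_predict)
```

With the real objects this would presumably be called as collect_labels_and_scores(test_set, classifier.predict_on_batch), with the test generator created with shuffle=False so the order is stable between runs.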

0
votes

I got it working. All I had to do was match the datatype and shape of preds, obtained from preds = classifier.predict(test_set), to those of the true_labels I collected from test_set. preds is a numpy.ndarray of single-element rows holding np.float32 values. Converting the labels to that same format and shape got roc_curve working.

Also, note that roc_curve returns three values, so it has to be unpacked as fpr, tpr, threshold = roc_curve(true_labels, preds). Unpacking into only two variables raises ValueError: too many values to unpack.
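For the plotting part of the question, a minimal sketch with matplotlib might look like the following. The labels and scores are made up, the (n, 1) shape of preds mirrors what is described above, and the output file name roc_curve.png is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import auc, roc_curve

# Made-up labels and positive-class scores standing in for the test set
true_labels = np.array([0, 0, 1, 1, 0, 1])
preds = np.array([[0.2], [0.6], [0.4], [0.9], [0.1], [0.7]], dtype=np.float32)

# Flatten the (n, 1) predictions so they line up with the 1-D labels
fpr, tpr, thresholds = roc_curve(true_labels, preds.ravel())
roc_auc = auc(fpr, tpr)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"ROC curve (AUC = {roc_auc:.2f})")
ax.plot([0, 1], [0, 1], linestyle="--", label="Chance")
ax.set_xlabel("False positive rate")
ax.set_ylabel("True positive rate")
ax.legend(loc="lower right")
fig.savefig("roc_curve.png")
```

The dashed diagonal is the curve a random classifier would produce, which gives a quick visual baseline for the AUC.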