
Value of val_acc does not change across epochs.

Summary:

  • I'm using a pre-trained (ImageNet) VGG16 from Keras;

    from keras.applications import VGG16
    conv_base = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
    
  • Dataset from ISBI 2016 (ISIC): a set of 900 skin-lesion images used for binary classification (malignant or benign) in training and validation, plus 379 images for testing;

  • I keep the top dense layers of VGG16 except the last one (which classifies over 1000 classes), and add a binary output with sigmoid activation;

    from keras import models, layers

    conv_base.layers.pop() # Remove the last (1000-class) layer
    conv_base.trainable = False
    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.Dense(1, activation='sigmoid'))
    
  • Unlock the dense layers by setting them to trainable;
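The unlocking step isn't shown; here is a minimal sketch of one way to do it, assuming the layer names of Keras's VGG16 ('fc1' and 'fc2' for the fully connected layers) and using weights=None only to skip the weight download in this sketch (in practice keep weights='imagenet'):

```python
from keras.applications import VGG16

# weights=None only for this sketch; use weights='imagenet' in practice
conv_base = VGG16(weights=None, include_top=True, input_shape=(224, 224, 3))

# Freeze everything, then unlock only the fully connected layers
for layer in conv_base.layers:
    layer.trainable = layer.name in ('fc1', 'fc2')

print([l.name for l in conv_base.layers if l.trainable])  # ['fc1', 'fc2']
```

Note that changes to trainable flags only take effect after the model is (re)compiled.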

  • Fetch the data, which is split across two folders, "malignant" and "benign", inside the "training data" folder;

    from keras.preprocessing.image import ImageDataGenerator
    from keras import optimizers
    
    folder = 'ISBI2016_ISIC_Part3_Training_Data'
    batch_size = 20
    
    full_datagen = ImageDataGenerator(
          rescale=1./255,
          #rotation_range=40,
          width_shift_range=0.2,
          height_shift_range=0.2,
          shear_range=0.2,
          zoom_range=0.2,
          validation_split=0.2, # 20% validation
          horizontal_flip=True)
    
    train_generator = full_datagen.flow_from_directory( # Found 721 images belonging to 2 classes.
            folder,
            target_size=(224, 224),
            batch_size=batch_size,
            subset='training',
            class_mode='binary')
    
    validation_generator = full_datagen.flow_from_directory( # Found 179 images belonging to 2 classes.
            folder,
            target_size=(224, 224),
            batch_size=batch_size,
            subset='validation',
            shuffle=False,
            class_mode='binary')
    
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.SGD(lr=0.001), # High learning rate
                  metrics=['accuracy'])
    
    history = model.fit_generator(
           train_generator,
           steps_per_epoch=721 // batch_size + 1,
           epochs=20,
           validation_data=validation_generator,
           validation_steps=179 // batch_size + 1, # 179 validation images found
           )
    
  • Then I fine-tune for 100 more epochs with a lower learning rate, setting the last convolutional layer to trainable.
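The fine-tuning step can be sketched as follows. This is a hypothetical minimal version (include_top=False and weights=None just to keep the sketch light and offline; the actual setup uses weights='imagenet' and the dense head described earlier):

```python
from keras import models, layers, optimizers
from keras.applications import VGG16

# weights=None only for this sketch; use weights='imagenet' in practice
conv_base = VGG16(weights=None, include_top=False, input_shape=(224, 224, 3))
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(1, activation='sigmoid'))

# Unlock only the last convolutional layer of the base
conv_base.trainable = True
for layer in conv_base.layers:
    layer.trainable = (layer.name == 'block5_conv3')

# Recompile so the new flags take effect; older Keras spells the
# argument lr= instead of learning_rate=
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(learning_rate=1e-5),
              metrics=['accuracy'])
```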

I've tried many things such as:

  1. Changing the optimizer (RMSprop, Adam and SGD);
  2. Removing the top dense layers of the pre-trained VGG16 and adding my own;

    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation='relu')) 
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid')) 
    
  3. Setting shuffle=True in the validation_generator;

  4. Changing batch size;

  5. Varying the learning rate (0.001, 0.0001, 2e-5).


The results are similar to the following:

Epoch 1/100
37/37 [==============================] - 33s 900ms/step - loss: 0.6394 - acc: 0.7857 - val_loss: 0.6343 - val_acc: 0.8101
Epoch 2/100
37/37 [==============================] - 30s 819ms/step - loss: 0.6342 - acc: 0.8107 - val_loss: 0.6342 - val_acc: 0.8101
Epoch 3/100
37/37 [==============================] - 30s 822ms/step - loss: 0.6324 - acc: 0.8188 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 4/100
37/37 [==============================] - 31s 840ms/step - loss: 0.6346 - acc: 0.8080 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 5/100
37/37 [==============================] - 31s 833ms/step - loss: 0.6395 - acc: 0.7843 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 6/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 7/100
37/37 [==============================] - 31s 834ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 8/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6342 - acc: 0.8093 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 9/100
37/37 [==============================] - 31s 849ms/step - loss: 0.6330 - acc: 0.8147 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 10/100
37/37 [==============================] - 30s 812ms/step - loss: 0.6332 - acc: 0.8134 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 11/100
37/37 [==============================] - 31s 839ms/step - loss: 0.6338 - acc: 0.8107 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 12/100
37/37 [==============================] - 30s 807ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 13/100
37/37 [==============================] - 32s 852ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 14/100
37/37 [==============================] - 31s 826ms/step - loss: 0.6330 - acc: 0.8134 - val_loss: 0.6336 - val_acc: 0.8101
Epoch 15/100
37/37 [==============================] - 32s 854ms/step - loss: 0.6335 - acc: 0.8107 - val_loss: 0.6336 - val_acc: 0.8101

And it goes on the same way, with val_acc constant at 0.8101.

When I evaluate on the test set after training, the confusion matrix shows 100% correct on benign lesions (304) and 0% on malignant ones:

    Confusion Matrix
    [[304   0]
     [ 75   0]]
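That matrix is exactly what pure majority-class guessing produces. A small self-contained illustration (NumPy only, using the class counts from the test set above):

```python
import numpy as np

def confusion_2x2(y_true, y_pred):
    """Rows = true class, columns = predicted class (0 = benign, 1 = malignant)."""
    m = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m

y_true = np.array([0] * 304 + [1] * 75)  # 304 benign, 75 malignant
y_pred = np.zeros_like(y_true)           # a model that always predicts "benign"

print(confusion_2x2(y_true, y_pred))     # [[304   0]
                                         #  [ 75   0]]
print((y_true == y_pred).mean())         # ~0.802 accuracy despite missing every malignant case
```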

What could I be doing wrong?

Thank you.


1 Answer


VGG16 was trained on mean-centered data. Your ImageDataGenerator does not enable featurewise_center, however, so you are feeding the network raw RGB values merely rescaled to [0, 1]. The VGG convolutional base can't extract meaningful information from inputs in a range it has never seen, so the network ends up universally guessing the more common class.
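One concrete way to fix this is to replace rescale=1./255 with Keras's own preprocess_input for VGG16, which applies the exact training-time preprocessing (RGB-to-BGR conversion plus per-channel ImageNet mean subtraction). A sketch, reusing the generator settings from the question:

```python
import numpy as np
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator

full_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,  # instead of rescale=1./255
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    validation_split=0.2,
    horizontal_flip=True)

# Sanity check: an all-zero image comes out mean-subtracted (negative values)
x = np.zeros((1, 224, 224, 3), dtype='float32')
print(preprocess_input(x.copy()).min() < 0)  # True
```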

In general, when you see this failure mode (the network exclusively predicting the most common class), the problem is usually in the data rather than the network. It can be caused by a preprocessing mismatch like this one, or by a significant portion of corrupted ("poisoned") training samples that actively harm the training process.