I am training an apparel classification CNN. I have around 60,000 images for training across 10 classes (split 80:20 into training and validation), plus a separate 10,000 images for testing.

The training accuracy improves over time, but the validation accuracy stays flat; likewise, the training loss decreases while the validation loss stays the same.

[Plot of accuracy]

[Plot of loss]

from keras.models import Sequential
from keras.layers import Activation, Convolution2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 28, 28
batch_size = 32
samples_per_epoch = 20000
validation_steps = 300
nb_filters1 = 32
nb_filters2 = 64
nb_filters3 = 128
conv1_size = 3
conv2_size = 2
pool_size = 2
classes_num = 10
epochs = 300

#learning_rate = 0.001
learning_rate = 0.01
decay_rate = learning_rate / epochs
momentum = 0.8
sgd = SGD(lr=learning_rate, momentum=momentum, decay=decay_rate, 
     nesterov=True)

model = Sequential()
model.add(
    Convolution2D(nb_filters1, conv1_size, conv1_size, border_mode="same", 
    input_shape=(img_width, img_height, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))

model.add(Convolution2D(nb_filters2, conv2_size, conv2_size, 
     border_mode="same"))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))

model.add(Flatten())
model.add(Dense(256))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(classes_num, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True
)
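As a side note on the optimizer settings above: in Keras, the `decay` argument of `SGD` applies a time-based schedule, lr_t = lr0 / (1 + decay * t), where t counts batch updates. A quick pure-Python sketch with the values from the question (the steps-per-epoch figure assumes the 20,000 samples per epoch and batch size 32 defined above):

```python
learning_rate = 0.01
epochs = 300
decay_rate = learning_rate / epochs  # ~3.33e-5, as in the question


def lr_at(step, lr0=learning_rate, decay=decay_rate):
    # Keras SGD time-based decay, applied once per batch update
    return lr0 / (1.0 + decay * step)


steps_per_epoch = 20000 // 32  # 625 updates per epoch

print(lr_at(0))                       # initial learning rate, 0.01
print(lr_at(steps_per_epoch * 300))  # learning rate after 300 epochs
```

With these numbers the learning rate decays gently, to roughly an order of magnitude below its starting value by the end of training.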

Training accuracy achieved: around 96%. Validation accuracy achieved: around 92%. Test accuracy achieved: around 87%.

My question: what can I do to improve the validation accuracy or reduce the validation loss? What changes would help?

1 Answer


What you are experiencing is called overfitting. You can add more regularisation; the easiest way is to add another Dropout layer.

    from keras.layers import Dropout

    model = Sequential()
    model.add(
        Convolution2D(nb_filters1, conv1_size, conv1_size, border_mode="same", 
        input_shape=(img_width, img_height, 3)))
    model.add(Activation("relu"))
    model.add(MaxPooling2D(pool_size=(pool_size, pool_size)))
    model.add(Dropout(0.3))  # <- THIS IS ADDED
    model.add(Flatten())
    # ... rest of the model unchanged

0.3 is the fraction of neurons whose outputs are multiplied by 0 during training, so their values are excluded from the subsequent computations. You can experiment with adding more Dropout layers and tuning their rates. You can also add weight regularisation (L1/L2 penalties) to your layers, which is explained here: https://keras.io/regularizers/.
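To make the dropout mechanics concrete, here is a minimal NumPy sketch of inverted dropout (the variant Keras uses): a fraction `rate` of units is zeroed at training time, and the survivors are rescaled by 1 / (1 - rate) so the expected activation is unchanged. The function name and shapes here are illustrative, not Keras internals.

```python
import numpy as np

rng = np.random.default_rng(0)


def dropout(x, rate=0.3, training=True):
    """Inverted dropout: zero out `rate` of the units, rescale the rest."""
    if not training:
        return x  # at inference time the layer is a no-op
    keep = rng.random(x.shape) >= rate  # True for units that survive
    return x * keep / (1.0 - rate)     # rescale so the expected value is unchanged


x = np.ones(10000)
y = dropout(x, rate=0.3)
# roughly 30% of the entries of y are zero, and its mean stays close to 1.0
```

At test time (`training=False`) the input passes through untouched, which is why the rescaling happens during training rather than at inference.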