
The microscopy images are in .tif format and have the following specifications:

  • Color model: R(ed)G(reen)B(lue)
  • Size: 2048 x 1536 pixels
  • Pixel scale: 0.42 μm x 0.42 μm
  • Memory space: 10-20 MB (approx.)
  • Type of label: image-wise
  • 4 classes: benign, invasive, in situ, normal

CNN code:

from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense


classifier = Sequential()

# Two convolution + max-pooling blocks
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Flatten the feature maps and add the fully connected classifier head
classifier.add(Flatten())

classifier.add(Dense(activation = 'relu', units = 128))
classifier.add(Dense(activation = 'softmax', units = 4))


classifier.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])


from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

# (Note: augmentation is usually not applied to test data; rescaling alone would suffice)
test_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

training_set = train_datagen.flow_from_directory('BioImaging2015/breasthistology/Training_data',
                                                 target_size = (64, 64),
                                                 batch_size = 1,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('BioImaging2015/breasthistology/Test_data',
                                            target_size = (64, 64),
                                            batch_size = 1,
                                            class_mode = 'binary')

# Note: samples_per_epoch / nb_epoch / nb_val_samples are old Keras 1 argument names
# (Keras 2 renamed them to steps_per_epoch / epochs / validation_steps)
classifier.fit_generator(training_set,
                         samples_per_epoch = 5000,
                         nb_epoch = 20,
                         validation_data = test_set,
                         nb_val_samples = len(test_set))

Data:

Found 249 images belonging to 4 classes.
Found 36 images belonging to 4 classes.

At first, the test data was all in a single folder, but that gave an error:

Found 0 images belonging to 0 classes.

Then I split it into 4 folders, one per class, as sketched below.
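This error is expected: flow_from_directory infers one class per subdirectory, so a flat folder with no class subfolders yields "Found 0 images belonging to 0 classes". A minimal sketch of the layout it expects (the subfolder names are assumptions based on the four classes; any names work, as long as they match between the training and test directories):

BioImaging2015/breasthistology/Test_data/
    benign/
        image_001.tif
        ...
    in_situ/
        ...
    invasive/
        ...
    normal/
        ...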

Output:

Epoch 1/20
5000/5000 [==============================] - 1056s 211ms/step - loss: 1.3914 - acc: 0.2754 - val_loss: 1.3890 - val_acc: 0.2500
Epoch 2/20
5000/5000 [==============================] - 1056s 211ms/step - loss: 1.2874 - acc: 0.3740 - val_loss: 1.6325 - val_acc: 0.3333
Epoch 3/20
5000/5000 [==============================] - 1056s 211ms/step - loss: 0.7412 - acc: 0.7098 - val_loss: 1.4916 - val_acc: 0.4722
Epoch 4/20
5000/5000 [==============================] - 1056s 211ms/step - loss: 0.3380 - acc: 0.8780 - val_loss: 1.4263 - val_acc: 0.5278
Epoch 5/20
5000/5000 [==============================] - 1057s 211ms/step - loss: 0.1912 - acc: 0.9346 - val_loss: 2.1176 - val_acc: 0.4722
Epoch 6/20
5000/5000 [==============================] - 1103s 221ms/step - loss: 0.1296 - acc: 0.9568 - val_loss: 2.8661 - val_acc: 0.4167
Epoch 7/20
5000/5000 [==============================] - 1182s 236ms/step - loss: 0.0964 - acc: 0.9698 - val_loss: 3.5154 - val_acc: 0.3611
Epoch 8/20
5000/5000 [==============================] - 1245s 249ms/step - loss: 0.0757 - acc: 0.9790 - val_loss: 3.6839 - val_acc: 0.3889
Epoch 9/20
3540/5000 [====================>.........] - ETA: 5:54 - loss: 0.0664 - acc: 0.9819

Here is my understanding:

  1. The loss is decreasing and the acc is increasing, which indicates the model is training well.

My questions are:

  1. The val_acc is decreasing and val_loss is increasing. Why? Is this overfitting? When I add dropout, acc and val_acc do not increase, and neither loss decreases.
  2. After 9 epochs, acc is still increasing. Should I train for more epochs and stop when acc stops increasing? Or should I stop when val_acc stops increasing? But val_acc does not increase at all.
  3. Is the CNN itself correct? I can't see where the problem is.

Changes:

  1. loss = 'sparse_categorical_crossentropy' -> loss = 'categorical_crossentropy'
  2. class_mode = 'binary' -> class_mode = 'categorical'
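
Applied to the code above, only the affected calls change (a minimal sketch; the same class_mode fix also applies to the test_set call):

classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

training_set = train_datagen.flow_from_directory('BioImaging2015/breasthistology/Training_data',
                                                 target_size = (64, 64),
                                                 batch_size = 1,
                                                 class_mode = 'categorical')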

Output 2:

Epoch 1/20
5000/5000 [==============================] - 1009s 202ms/step - loss: 1.3878 - acc: 0.2752 - val_loss: 1.3893 - val_acc: 0.2500
Epoch 2/20
5000/5000 [==============================] - 1089s 218ms/step - loss: 1.3844 - acc: 0.2774 - val_loss: 1.3895 - val_acc: 0.2500
Epoch 3/20
5000/5000 [==============================] - 1045s 209ms/step - loss: 1.3847 - acc: 0.2764 - val_loss: 1.3894 - val_acc: 0.2500
Epoch 4/20
5000/5000 [==============================] - 1077s 215ms/step - loss: 1.3843 - acc: 0.2764 - val_loss: 1.3885 - val_acc: 0.2500
Epoch 5/20
5000/5000 [==============================] - 1051s 210ms/step - loss: 1.3841 - acc: 0.2768 - val_loss: 1.3887 - val_acc: 0.2500
Epoch 6/20
5000/5000 [==============================] - 1050s 210ms/step - loss: 1.3841 - acc: 0.2782 - val_loss: 1.3891 - val_acc: 0.2500
Epoch 7/20
5000/5000 [==============================] - 1053s 211ms/step - loss: 1.3836 - acc: 0.2780 - val_loss: 1.3900 - val_acc: 0.2500
Comment: "You have the telltale signature of overfitting: after epoch #4, your validation error starts increasing, while your training error continues to decrease. Most probably, this is due to your very small dataset..." – desertnaut

1 Answer


As you have four classes and softmax activation on the last layer, it seems very unlikely to me that your choices of class_mode='binary' for flow_from_directory() and loss='sparse_categorical_crossentropy' for classifier.compile() are correct. The labels generated this way won't make sense.

class_mode='binary' will generate labels in the form [0,1,1,0,1,1,...], which only makes sense for a yes/no prediction (hence "binary"), while loss='sparse_categorical_crossentropy' expects labels in the form [0,3,2,1,3,2,0,2,...] (one integer class index per sample).

Try class_mode='categorical' and loss='categorical_crossentropy' instead. This will generate one-hot-encoded labels, e.g.

[[0,0,1,0],
 [0,1,0,0],
 [0,0,0,1],
 ...      ]

which is exactly what loss='categorical_crossentropy' expects to get. Also, the choice of activation='softmax' in the last layer is perfectly suited for this, as it makes sure that the four values of the last layer always sum up to 1.
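
If you want to verify the labels after this fix, here is a small sketch (assuming the training_set generator from the question) that inspects the inferred class mapping and one batch:

# Mapping from subfolder name to class index, inferred by flow_from_directory
print(training_set.class_indices)

# Pull a single batch and confirm the labels are one-hot vectors of length 4
x_batch, y_batch = next(training_set)
print(x_batch.shape)   # (1, 64, 64, 3) with batch_size = 1
print(y_batch)         # e.g. [[0. 0. 1. 0.]]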

Regarding your questions:

  1. Yes, you are very likely facing overfitting due to the incorrect labels (they don't make sense). Your model is essentially learning random labels (the training data) and therefore cannot do well on other random labels (the validation data).
  2. You should stop when val_acc stops increasing (see the first sketch after this list). Yes, in your case that point is reached after only two epochs, but for good models this is common practice. Your understanding has a flaw: performing great on the training data is not the goal! Remember, in the end you want your model to predict pictures it has never seen before, so only the validation data tells you the truth. (Actually, it's even better to keep yet another test dataset that is never touched during training and to evaluate against it at the very end, after calling fit or fit_generator.)
  3. The network itself is correct; just your data isn't (as explained above). However, if it still performs badly after the suggested fix, you will need to experiment with more filters in your convolutional layers and with adding further convolutional layers. Dropout after the middle layers, with a rate of 0.2 to 0.5, is always a good way to avoid overfitting (see the second sketch after this list). You will need to experiment with these settings.
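
For question 2, stopping when val_acc stops improving can be automated with a callback. A minimal sketch, assuming the same old-style fit_generator call from the question (where the accuracy metric is reported as acc/val_acc):

from keras.callbacks import EarlyStopping

# Stop training once val_acc has not improved for 3 consecutive epochs
early_stop = EarlyStopping(monitor = 'val_acc', patience = 3)

classifier.fit_generator(training_set,
                         samples_per_epoch = 5000,
                         nb_epoch = 20,
                         validation_data = test_set,
                         nb_val_samples = len(test_set),
                         callbacks = [early_stop])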
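For question 3, here is one way dropout could be slotted into the model from the question; the placement and the 0.2/0.5 rates are illustrative choices within the range suggested above, not the only valid ones:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

classifier = Sequential()

classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Dropout(0.2))   # drop 20% of activations after the first block

classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Dropout(0.2))

classifier.add(Flatten())
classifier.add(Dense(activation = 'relu', units = 128))
classifier.add(Dropout(0.5))   # heavier dropout before the final softmax layer
classifier.add(Dense(activation = 'softmax', units = 4))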