CNN overfitting on validation set increases test set performance

Question

I'm using a CNN to classify images. I have 16 classes and around 3000 images (a very small dataset). The dataset is unbalanced. I do a 60/20/20 split, with the same percentage of each class in every set. I use weight regularization. I ran tests with data augmentation (the Keras augmenter, SMOTE, ADASYN), which helps to prevent overfitting.

When I overfit (epoch=350, loss=2), my model performs better (70+% accuracy, and similarly on other metrics such as F1 score) than when I don't overfit (epoch=50, loss=1), where accuracy is around 60%. The accuracy is measured on the TEST set, while the loss is the validation set loss.

Is it really a bad thing to use the overfitted model as the best model, since its performance is better on the test set?

I have run the same model with another test set (which was previously part of the training set), and performance is still better (I tried 3 different splits).

EDIT: From what I have read, validation loss is not always the best metric for deciding that a model is overfitting. In my situation, it's better to use the validation F1 score and recall: when they start to decrease, the model is probably overfitting. I still don't understand why validation loss is a bad metric for model evaluation, given that training loss is what the model uses to learn.
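One way to see how validation loss and accuracy can move in opposite directions is a small numeric sketch. It assumes a binary problem for simplicity, and the probabilities below are made up for illustration, not taken from the model above. Cross-entropy penalizes confident wrong predictions very heavily, so a single badly wrong sample can raise the mean loss even while more samples are classified correctly.

```python
import math

def cross_entropy(true_class_probs):
    """Mean negative log-likelihood of the true class."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)

def accuracy(true_class_probs, threshold=0.5):
    """Fraction of samples whose true class gets more than `threshold` probability."""
    correct = sum(1 for p in true_class_probs if p > threshold)
    return correct / len(true_class_probs)

# Hypothetical predicted probability of the TRUE class for 5 validation samples.
early_epoch = [0.60, 0.60, 0.45, 0.45, 0.40]   # hesitant model, few samples correct
late_epoch = [0.99, 0.99, 0.95, 0.90, 0.001]   # confident model, one very confident mistake

early_loss, early_acc = cross_entropy(early_epoch), accuracy(early_epoch)
late_loss, late_acc = cross_entropy(late_epoch), accuracy(late_epoch)

print(f"early: loss={early_loss:.3f}, accuracy={early_acc:.2f}")
print(f"late:  loss={late_loss:.3f}, accuracy={late_acc:.2f}")
```

In this toy case the later epoch has both a higher loss and a higher accuracy, which matches the pattern described above. It does not show that the later model generalizes better, only that the two metrics measure different things.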

How are you concluding that your model has already "overfitted"? — Supratim Haldar
I thought that when validation loss increases, the model is overfitting; most tutorials say that. But it looks like it's more complex than that. I actually get similar (good) performance on the validation and test sets when validation loss increases. It looks like it's not a good indicator of whether the model overfits in my situation, am I right? — akhetos

Chandan M S · Accepted Answer · 2019-05-16T12:45:22

Yes, it is a bad thing to use an overfitted model as the best model. By definition, a model that overfits doesn't perform well in real-world scenarios, i.e. on images that are not in the training or test set.

To avoid overfitting, use image augmentation to balance the classes and increase the number of training samples. Also try increasing the dropout rate. I personally use Keras's ImageDataGenerator to augment the images and save them.

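from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
import glob
import os

# There are other parameters too. Check the link given at the end of the answer.
datagen = ImageDataGenerator(
        brightness_range=(0.4, 0.6),
        horizontal_flip=True,
        fill_mode='nearest'
        )

# path_to_images must be defined beforehand as a glob pattern matching your images.
output_dir = 'augmented-images/preview/fist'
os.makedirs(output_dir, exist_ok=True)  # make sure the output directory exists before saving
num_of_samples_per_image_augmentation = 8

for image_path in glob.glob(path_to_images):
    img = load_img(image_path)
    x = img_to_array(img)  # NumPy array of shape (height, width, channels)
    x = x.reshape((1,) + x.shape)  # add a batch dimension: (1, height, width, channels)

    # flow() yields batches indefinitely and saves each one, so stop explicitly.
    flow = datagen.flow(x, save_to_dir=output_dir, save_prefix='fist', save_format='jpg')
    for count, batch in enumerate(flow, start=1):
        if count >= num_of_samples_per_image_augmentation:
            break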

Here is the link to the Keras image augmentation parameters: https://keras.io/preprocessing/image/

Feel free to use any other library you are comfortable with.
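Whichever library you pick, balancing an unbalanced dataset through augmentation means generating more copies for the rare classes than for the common ones. Here is a minimal sketch of that bookkeeping, assuming you already know the per-class image counts of your training split. The helper name `copies_per_image` and the class counts are hypothetical and only serve as an illustration.

```python
import math

def copies_per_image(class_counts, target_per_class=None):
    """Return, per class, how many augmented copies to generate for each
    original image so that the class reaches at least `target_per_class`.

    By default the target is the size of the largest class.
    """
    if target_per_class is None:
        target_per_class = max(class_counts.values())
    plan = {}
    for label, count in class_counts.items():
        if count <= 0:
            raise ValueError(f"class {label!r} has no images to augment")
        missing = max(target_per_class - count, 0)
        # Round up so the class reaches the target (it may slightly overshoot).
        plan[label] = math.ceil(missing / count)
    return plan

# Hypothetical training-set counts for three classes.
class_counts = {"fist": 400, "palm": 200, "thumb": 50}
plan = copies_per_image(class_counts)
print(plan)
```

The resulting number could be used per class in place of a fixed number of samples per image. It is probably safest to augment only the training split, so that augmented variants of one image do not end up in both the training and the validation or test set.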

A few other methods to reduce overfitting:

1) Tweak your CNN model by adding more training parameters.

2) Reduce the number of fully connected layers.

3) Use transfer learning (pre-trained models).
