
I'm a beginner in deep learning and I'm trying to train a deep learning model to classify different ASL hand signs using Mobilenet_v2 and Inception.

Here is the code that creates an ImageDataGenerator for the training and validation sets.

# Reformat Images and Create Batches
import tensorflow as tf

IMAGE_RES = 224
BATCH_SIZE = 32

# Rescale pixel values to [0, 1] and reserve 40% of the images for validation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.4
)

train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='training'
)

val_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='validation'
)

Here is the code to train the model:

# Do transfer learning with TensorFlow Hub
import tensorflow_hub as hub
from tensorflow.keras import layers

URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
                                   input_shape=(IMAGE_RES, IMAGE_RES, 3))

# Freeze the pre-trained feature extractor
feature_extractor.trainable = False

# Attach a classification head for the 5 classes
model = tf.keras.Sequential([
    feature_extractor,
    layers.Dense(5, activation='softmax')
])

model.summary()

# Compile and train the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'])

EPOCHS = 5

history = model.fit(train_generator,
                    steps_per_epoch=len(train_generator),
                    epochs=EPOCHS,
                    validation_data=val_generator,
                    validation_steps=len(val_generator))

Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705
Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145
Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235
Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395
Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365

I've tried using data augmentation, but the model still overfits, so I'm wondering if I've done something wrong in my code.

Comments:

- What methods of data augmentation are you using? What are the sizes of your train/validation/test sets? (gmiles)
- Hi, I'm using 70% of my data for training and the other 30% for validation. I've tried using the following for data augmentation: rotation_range=15, width_shift_range=.1, height_shift_range=.1, horizontal_flip=True, zoom_range=0.2. (junsiong2008)
- After doing data augmentation and training for 10 epochs, my training accuracy is 0.9997 and val_accuracy is 0.8365. (junsiong2008)
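
For reference, a minimal sketch of how those augmentation settings could be wired in, applying them to the training generator only and leaving the validation generator un-augmented. The parameter values are the ones quoted in the comments above, the 0.3 validation split follows the 70/30 split mentioned there, and base_dir is assumed to be the same dataset directory as in the question:

import tensorflow as tf

IMAGE_RES = 224
BATCH_SIZE = 32

# Augment only the training data; the validation data is just rescaled.
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    zoom_range=0.2,
    validation_split=0.3
)
val_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.3
)

# Both generators use the same validation_split, so the (order-based)
# split into training and validation files stays consistent.
train_generator = train_datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='training'
)
val_generator = val_datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='validation'
)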

1 Answer


Your dataset is quite small. Try splitting it with different random seeds and check whether the problem still persists.
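
A sketch of one way to do a seeded random split, assuming image_dataset_from_directory is available in the TensorFlow version being used (TF 2.3+); base_dir and the 70/30 split are taken from the question, and the seed value is just an example:

import tensorflow as tf

# A seeded random split: changing `seed` reshuffles which images end up in the
# validation set (ImageDataGenerator's validation_split is order-based, not random).
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    base_dir,
    validation_split=0.3,
    subset='training',
    seed=42,                      # change the seed to test a different split
    image_size=(224, 224),
    batch_size=32,
    label_mode='categorical'      # one-hot labels for categorical_crossentropy
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    base_dir,
    validation_split=0.3,
    subset='validation',
    seed=42,                      # must match the training seed
    image_size=(224, 224),
    batch_size=32,
    label_mode='categorical'
)

# The hub feature extractor expects inputs in [0, 1], so rescale manually.
train_ds = train_ds.map(lambda x, y: (x / 255.0, y))
val_ds = val_ds.map(lambda x, y: (x / 255.0, y))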

If the overfitting persists, add regularization and reduce the complexity of the neural network.
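
For example, dropout and an L2 penalty could be added to the classification head from the question; the 0.5 dropout rate and the 1e-4 weight below are illustrative starting points, not tuned values:

import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers, regularizers

URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL,
                                   input_shape=(224, 224, 3),
                                   trainable=False)  # keep the backbone frozen

model = tf.keras.Sequential([
    feature_extractor,
    layers.Dropout(0.5),                                   # randomly drop features during training
    layers.Dense(5, activation='softmax',
                 kernel_regularizer=regularizers.l2(1e-4)) # L2 penalty on the head's weights
])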

Also experiment with different optimizers and a smaller learning rate (try a learning-rate scheduler).
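
A sketch of a lower starting learning rate combined with a ReduceLROnPlateau scheduler; the 1e-4 rate and the callback settings are illustrative, and `model`, `train_generator`, and `val_generator` are the ones defined in the question:

import tensorflow as tf

# Start from a smaller learning rate than the Adam default (1e-3).
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

model.compile(
    optimizer=optimizer,
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Halve the learning rate whenever val_loss stops improving for 2 epochs.
lr_scheduler = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss', factor=0.5, patience=2, min_lr=1e-6
)

history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator,
    callbacks=[lr_scheduler]
)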