0 votes

I'm working on a sound recognition project.

I have 1500 labeled sound samples across 5 classes (300 two-second samples per class).

I'm using an online tool (Edge Impulse) to compute the MFCC coefficients, so I cannot provide that code; I then train a neural network on them.

The dataset is split:

  • 80% --> a training set, itself split 80/20 into training/validation

  • 20% --> a test set
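For concreteness, the split above can be sketched with scikit-learn's `train_test_split` (a sketch, not my actual preprocessing; the feature width of 650 is a placeholder for the flattened MFCCs):

```python
# Sketch of the 80/20 + 80/20 split described above, using scikit-learn.
# X and y are placeholders: 1500 samples, 5 balanced classes.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1500, 650))      # placeholder features (e.g. flattened MFCCs)
y = np.repeat(np.arange(5), 300)      # 300 samples per class

# 20% held out as the test set (stratify keeps the classes balanced)
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)

# remaining 80% split 80/20 into training/validation
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.20, stratify=y_trainval, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 960 240 300
```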

After 200 training cycles, the first version of my network had the following (very bad) performance:

training accuracy = 100% / validation accuracy = 30%

By searching the web and this forum, I found some methods to reduce overfitting.

The performance of the latest version of my network is the following:

training accuracy = 80% / validation accuracy = 60% (after 200 training cycles)

As you can see, there is still a significant gap between training accuracy and validation accuracy.

My question is: how can I continue to increase my validation accuracy?

The code of my neural network:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, InputLayer, Dropout, Conv1D, Flatten, Reshape, MaxPooling1D, BatchNormalization
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import Adam

# model architecture
model = Sequential()
model.add(InputLayer(input_shape=(X_train.shape[1], ), name='x_input'))
# reshape the flat MFCC vector into (frames, 13 coefficients)
model.add(Reshape((int(X_train.shape[1] / 13), 13)))
model.add(Conv1D(30, kernel_size=1, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)))
model.add(Dropout(0.5))
# note: pool_size=1 is effectively a no-op; pooling only downsamples
# the sequence for pool_size > 1
model.add(MaxPooling1D(pool_size=1, padding='same'))
model.add(Conv1D(10, kernel_size=1, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=1, padding='same'))
model.add(Flatten())
model.add(Dense(classes, activation='softmax', name='y_pred'))

# this controls the learning rate ('lr' is deprecated; use 'learning_rate')
opt = Adam(learning_rate=0.005, beta_1=0.9, beta_2=0.999)
#opt = Adadelta(learning_rate=1.0, rho=0.95)

# train the neural network
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=50, epochs=200,
          validation_data=(X_test, Y_test), verbose=2)

Thank you,

Regards,

Lionel

Comments:

  • A difference in training & validation performance in itself does not signify overfitting. – desertnaut

  • @desertnaut, thank you for your post. Now that you've made the diagnosis, what is the solution? Maybe I have to perform additional sound recordings to increase the size of my training set? What do you think? – dkk

  • There is never an easy answer in such questions, let alone one that fits in a SO comment :) – desertnaut

  • OK, I understand, @desertnaut! – dkk

2 Answers

0 votes
  1. Try stratified K-fold cross-validation
  2. Try different batch sizes {16, 32, 64}
  3. Try a sigmoid activation
  4. Add BatchNormalization() layers
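Point 1 can be sketched with scikit-learn's `StratifiedKFold` (a sketch; `build_model()` would be a stand-in for the question's Keras model, and the data here is a small placeholder):

```python
# Sketch of stratified K-fold cross-validation: every fold preserves
# the class proportions of y.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.random.rand(100, 26)          # placeholder features
y = np.repeat(np.arange(5), 20)      # 5 balanced classes

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # in practice: model = build_model(); model.fit(X[train_idx], y[train_idx], ...)
    print(fold, len(train_idx), len(val_idx))  # each fold: 80 train / 20 val
```

Averaging the validation accuracy over the 5 folds gives a much less noisy estimate than a single 240-sample validation set.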
0 votes

In general, to reduce overfitting you can do the following:

  1. Add more regularization (e.g. multiple layers of dropout with higher dropout rates)
  2. Reduce the number of features
  3. Reduce the capacity of the network (e.g. decrease number of layers or number of hidden units)
  4. Reduce the batch size