TensorFlow CNN loss function goes up and down (oscillating) in TensorBoard. How can I remove the oscillations?

Question

I am training a ResNet50 on the AudioSet 2017 dataset with TensorFlow. During training and validation, my loss fluctuates. The overall trend is downward, but the fluctuations worry me.

I have trained for 100 epochs with a batch size of 100. I have tried both decreasing and increasing the learning rate, but it had no effect.

I want to know whether my training is correct. Can I use this network, or will it produce wrong results? Can I remove the oscillations with some tricks? These are my training and validation (eval) loss and other metric plots from TensorBoard.

validation mode:

train mode:

Are you using tf.train.exponential_decay? See the staircase parameter. — Mateen Ulhaq
@Mateen Ulhaq No, I am just using a batchnorm layer after conv2d. — creative_sh

Kilian Batzner · Accepted Answer · 2017-12-12T09:14:16

It seems like after 12k steps, the model starts to overfit. The training loss further decreases while the validation loss (generalization error) slowly increases again. After this point, training the model only makes it worse.

In the figure below you are in the overfitting zone.

(From www.deeplearningbook.org)
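To make the overfitting point concrete, here is a minimal sketch that locates the step with the lowest validation loss in an evaluation log. The function name `best_validation_step` and all numbers are hypothetical and only for illustration; they are not the asker's data.

```python
def best_validation_step(steps, val_losses):
    """Return (step, loss) at which the validation loss is lowest."""
    if len(steps) != len(val_losses) or not steps:
        raise ValueError("steps and val_losses must be non-empty and equally long")
    # Index of the smallest validation loss (first one wins on ties).
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return steps[best_index], val_losses[best_index]


# Hypothetical evaluation log (made-up numbers for illustration).
steps = [2000, 4000, 6000, 8000, 10000, 12000, 14000, 16000]
val_losses = [0.90, 0.71, 0.60, 0.54, 0.51, 0.50, 0.52, 0.55]

step, loss = best_validation_step(steps, val_losses)
print(f"lowest validation loss {loss:.2f} at step {step}")
```

Keeping the checkpoint saved closest to that step, instead of the final one, is the idea usually called early stopping. It is a different remedy from the regularization suggested in this answer, but it follows from the same observation.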

You might want to reduce the model's ability to overfit the training data by increasing regularization, for example with L2 weight regularization or dropout.
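As a rough illustration of what these two regularizers do, the sketch below implements them in plain Python rather than with the TensorFlow API. The helper names `l2_penalty` and `dropout`, the weight values, and the `weight_decay` and `rate` settings are all assumptions chosen for the example.

```python
import random


def l2_penalty(weights, weight_decay):
    """L2 term added to the data loss: weight_decay * sum(w^2)."""
    return weight_decay * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: during training, zero each activation with
    probability `rate` and rescale the survivors by 1 / (1 - rate).
    At inference time the activations pass through unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Hypothetical values for illustration only.
weights = [0.5, -1.0, 2.0]
data_loss = 0.40
total_loss = data_loss + l2_penalty(weights, weight_decay=1e-2)
print(total_loss)  # data loss plus 0.01 * (0.25 + 1.0 + 4.0)

rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))                  # training
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng, training=False))  # inference
```

In TensorFlow, the corresponding building blocks include `tf.keras.regularizers.l2` (passed as a layer's `kernel_regularizer`) and `tf.keras.layers.Dropout`. Which strength and rate work for this ResNet50 would have to be found by experiment.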

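As for the oscillations, they are probably natural given your batch size of 100: each step's loss is computed on a different small sample of the data, so it varies from step to step.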
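The effect of batch size on this noise can be checked with a small simulation. The sketch below is not the asker's model: it assumes a fixed pool of hypothetical per-example losses and measures how much the mean loss of a randomly drawn mini-batch varies for different batch sizes. The function name `batch_loss_std` is made up for the example.

```python
import random
import statistics


def batch_loss_std(per_example_losses, batch_size, num_batches, rng):
    """Standard deviation of the mean loss over randomly drawn mini-batches."""
    batch_means = [
        statistics.fmean(rng.sample(per_example_losses, batch_size))
        for _ in range(num_batches)
    ]
    return statistics.stdev(batch_means)


rng = random.Random(0)
# Hypothetical per-example losses: exponentially distributed with mean 1.
losses = [rng.expovariate(1.0) for _ in range(20000)]

for batch_size in (10, 100, 1000):
    spread = batch_loss_std(losses, batch_size, num_batches=500, rng=rng)
    print(f"batch_size={batch_size:5d}  std of batch loss = {spread:.3f}")
```

For independently sampled examples, the spread of the batch mean shrinks roughly with the square root of the batch size, so a larger batch gives a smoother curve at the cost of more computation per step. In real training the weights also change between steps, so this captures only part of the fluctuation seen in TensorBoard.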
