2
votes

I am training a ResNet50 on the Audioset2017 dataset with TensorFlow. During both training and validation, my loss fluctuates. The overall trend is downward, but the fluctuations worry me.

I have trained for 100 epochs with a batch size of 100. I have tried both decreasing and increasing the learning rate, but it had no effect.

I want to know whether my training is correct. Can I use this network, or will it give wrong results? Can I remove the fluctuations with some tricks? These are my train and validation (eval) loss and other metric plots (from TensorBoard).

validation mode:

train mode:

2
Are you using tf.train.exponential_decay? See the staircase parameter. - Mateen Ulhaq
@Mateen Ulhaq No, I am just using a batchnorm layer after conv2d. - creative_sh

2 Answers

3
votes

It seems like the model starts to overfit after about 12k steps. The training loss continues to decrease while the validation loss (generalization error) slowly increases again. After this point, further training only makes the model worse on unseen data.

In the figure below you are in the overfitting zone.

(From www.deeplearningbook.org)
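One common way to act on this is early stopping: keep the checkpoint with the lowest validation loss and stop once the loss has not improved for a few evaluations. The following is only a minimal sketch of that idea in plain Python. The function name `best_checkpoint`, the `patience` parameter, and the example loss values are made up for illustration and are not taken from the question.

```python
def best_checkpoint(val_losses, patience=3):
    """Return (best_index, stop_index) for a sequence of validation losses.

    best_index: evaluation with the lowest validation loss seen before stopping.
    stop_index: first evaluation at which the loss has failed to improve for
    `patience` evaluations in a row (or the last index if that never happens).
    """
    best_index = 0
    best_loss = float("inf")
    bad_evals = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_index = i
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return best_index, i
    return best_index, len(val_losses) - 1


# Hypothetical validation losses: improving at first, then rising again.
example_losses = [0.90, 0.70, 0.60, 0.55, 0.58, 0.57, 0.60, 0.63]
best, stop = best_checkpoint(example_losses, patience=3)
```

In a real training loop you would save the model weights whenever a new best loss is seen and restore them after stopping.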

You might want to reduce the model's ability to overfit the training data by increasing regularization, for example with L2 weight regularization or dropout.
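To make the two techniques concrete, here is a rough, framework-free sketch of what they compute. It assumes the convention where the L2 penalty is `weight_decay * sum(w**2)` (some libraries include an extra factor of 1/2), and it uses "inverted" dropout, which rescales the kept activations during training. The helper names and the numbers are illustrative only. In TensorFlow you would normally use the built-in `tf.keras.regularizers.l2` and `tf.keras.layers.Dropout` instead.

```python
import random


def l2_penalty(weights, weight_decay):
    """L2 penalty added to the data loss: weight_decay * sum of squared weights."""
    return weight_decay * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` during
    training and scale the survivors by 1 / (1 - rate). No-op at eval time."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Illustrative values, not taken from the question.
data_loss = 0.40
weights = [0.5, -1.0, 2.0]
total_loss = data_loss + l2_penalty(weights, weight_decay=0.01)

rng = random.Random(0)  # seeded so the dropout mask is reproducible
dropped = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=rng)
```

A larger `weight_decay` or dropout `rate` regularizes more strongly. Both are hyperparameters that are usually tuned against the validation loss.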

As for the oscillations, they are probably natural given your batch size of 100: each step computes the loss on a different mini-batch, so some step-to-step noise is expected.
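If the noise makes the trend hard to read, one option is to smooth the logged loss values before judging them. The snippet below is a small sketch of an exponential moving average with bias correction. The function name `smooth` and the `weight` parameter are assumptions for this example, and this is not claimed to match TensorBoard's smoothing slider exactly.

```python
def smooth(values, weight=0.9):
    """Exponential moving average of `values` with bias correction.

    `weight` in [0, 1) controls how much history is kept: values closer
    to 1 give a smoother curve that reacts more slowly to changes.
    """
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Divide by (1 - weight**step) so early points are not biased toward 0.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed


# Hypothetical noisy per-step losses.
noisy_losses = [1.0, 3.0, 1.2, 2.8, 1.1]
smoothed_losses = smooth(noisy_losses, weight=0.9)
```

Smoothing changes only how the curve is displayed. It does not change the training itself.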

0
votes

In a good model, you want the loss on the validation set to go down. A downward trend indicates that your model is generalizing to previously unseen examples. The general goal of machine learning is to learn model parameters from sampled data points that capture the learning problem, so that the model can predict well on out-of-sample examples.

For the training set, a downward trend in the loss indicates that the model is learning a reasonable estimate of the target output from the training examples provided. You generally want to see this downward trend as well; otherwise, your model is under-fitting the training set and is unlikely to do well on the validation set either.

For a brief introduction to interpreting supervised learning models, please read "Supervised Machine Learning: A Conversational Guide For Executives And Practitioners".