In TensorFlow, training and test show different results

Question

I am facing a problem in TensorFlow. I have a network of 6 convolutional layers, each with batch normalization; the last convolution is followed by average pooling so that the output has shape Nx1x1xC. The network classifies an image into one category. Everything looks fine during training:

- about 150000 training samples
- about 12000 validation samples, evaluated during training

I trained for 50000 iterations in total with a mini-batch size of 6.

- During training, the training loss decreases steadily (from about 2.6 at the beginning to about 0.3 at iteration 50000).
- The validation accuracy increases and saturates after about 40000 iterations (from 60% at the beginning to 72% at iteration 50000).

BUT when I test with the weights learned at iteration 50000 on the same validation samples, the overall accuracy (OA) is only about 40%. I searched for others who have faced similar problems. Some said the decay of the moving average in batch normalization may be the cause.

The default decay in tf.contrib.layers.batch_norm is 0.999. I then trained with decays of 0.9, 0.99, and 0.999; the resulting OA on the validation samples at test time was 70%, 30%, and 39%, respectively. Although a decay of 0.9 gives the best result, it is still lower than the OA on the validation samples during training.
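To make the role of the decay concrete, here is a minimal plain-Python sketch. It assumes the usual exponential moving average update, `moving = decay * moving + (1 - decay) * batch_stat`; the helper name `ema_after` and the numbers are illustrative only and are not taken from the network above.

```python
def ema_after(steps, decay, initial, batch_stat):
    """Apply moving = decay * moving + (1 - decay) * batch_stat `steps` times.

    Hypothetical helper: it assumes the batch statistic is the same
    constant on every step, which real training data will not satisfy.
    """
    moving = initial
    for _ in range(steps):
        moving = decay * moving + (1.0 - decay) * batch_stat
    return moving


for decay in (0.9, 0.99, 0.999):
    # Weight of the initial value that is still left after 1000 updates.
    leftover = decay ** 1000
    tracked = ema_after(1000, decay, initial=0.0, batch_stat=5.0)
    print(decay, leftover, tracked)
```

Under this assumption, a decay of 0.999 still carries about 0.37 of the initial value after 1000 updates, while 0.9 has forgotten it almost entirely. A smaller decay therefore follows recent batches faster, at the price of noisier statistics.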

Has anyone had a similar problem, and do you have any idea what could be the cause?

best wishes,

My guess is that, for some reason, the two graphs you are executing differ. One thing that could lead to a difference is dropout: are you using dropout during the in-training evaluation? Try to find an example where the two predictions differ and dig deeper to see what is going on. Are the activations the same? — Ivaylo Strandjev
I have checked the two graphs and they are the same, and I do not use dropout. Still, many thanks for your suggestions! — Chun Yang

ibarrond · Accepted Answer · 2018-07-16T11:46:29

I have two suggestions:

- If you aren't using a bool isTraining and passing it to the Batch Normalization layers, do so! This should be a placeholder, set for each session run (true in training, false in test/validation).
- Check that you don't shuffle your test/validation dataset during test/validation (there might be some kind of shuffle=True in whatever imports or manages the batches of your data).

The first one is key; the second shouldn't make much of a difference, but it ensures exact numerical values each time.
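Why the training flag matters can be sketched in plain Python. This is a simplified scalar illustration, not the TensorFlow implementation: the function `batch_norm`, its arguments, and the sample values are all hypothetical, and the learned scale and offset are left out. In training mode it normalizes with the statistics of the current batch; in inference mode it uses the stored moving statistics.

```python
import math


def batch_norm(batch, moving_mean, moving_var, is_training,
               decay=0.999, eps=1e-3):
    """Normalize a list of scalars.

    Returns (outputs, moving_mean, moving_var). Illustrative only.
    """
    if is_training:
        # Training: use the statistics of this batch ...
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
        # ... and fold them into the moving averages.
        moving_mean = decay * moving_mean + (1.0 - decay) * mean
        moving_var = decay * moving_var + (1.0 - decay) * var
    else:
        # Test/validation: use the stored moving statistics unchanged.
        mean, var = moving_mean, moving_var
    outputs = [(x - mean) / math.sqrt(var + eps) for x in batch]
    return outputs, moving_mean, moving_var


sample = 2.0
# The same sample placed in two different batches.
batch_a = [sample, 1.0, 3.0]
batch_b = [sample, 10.0, -4.0]

test_a, _, _ = batch_norm(batch_a, 0.0, 1.0, is_training=False)
test_b, _, _ = batch_norm(batch_b, 0.0, 1.0, is_training=False)
train_a, _, _ = batch_norm(batch_a, 0.0, 1.0, is_training=True)
train_b, _, _ = batch_norm(batch_b, 0.0, 1.0, is_training=True)

print(test_a[0] == test_b[0])    # inference output ignores batch companions
print(train_a[0] == train_b[0])  # training output depends on them
```

In this sketch, the output for a sample in inference mode does not depend on which other samples share its batch, so batch composition and shuffling stop mattering. If the layer is left in training mode at test time, each prediction depends on its batch companions, which with a mini-batch size of 6 can plausibly shift results noticeably.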
