
In the TensorFlow guide about transfer learning, it says:

When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing training=False when calling the base model.

What I understand from this is that even when I unfreeze the layers, if the pre-trained model contains BatchNormalization layers, I should set training=False, just like in the code below:

from tensorflow.keras import Input, Model, regularizers
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D

resnet = ResNet50(weights='imagenet', include_top=False)
resnet.trainable = True  # unfreeze

inputs = Input(shape=(150, 150, 3))
x = resnet(inputs, training=False)  # because of BN
x = GlobalAveragePooling2D()(x)
x = Dropout(0.2)(x)
outputs = Dense(150, kernel_regularizer=regularizers.l2(0.005), activation='softmax')(x)
model = Model(inputs, outputs)

However, I got very low accuracy and the model barely learned anything, whereas when I set training=True the accuracy was satisfactory.

So, these are my questions:

  1. Is it wrong to set training=True for a model with BN layers?
  2. What does training=False mean? I thought it was related to back-propagation.

Thanks in advance!


1 Answer


A BN layer has four parameters: two are trainable scale factors (gamma and beta), and the other two are non-trainable running statistics, the moving mean and moving variance of the input features of that BN layer.
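You can see the four parameters directly by building a standalone Keras BatchNormalization layer and inspecting its weights; this is a minimal sketch using the standard Keras variable names:

```python
from tensorflow.keras.layers import BatchNormalization

# Build a BN layer for a 4-channel input so its variables are created.
bn = BatchNormalization()
bn.build((None, 4))

# Two trainable scale factors: names include 'gamma' and 'beta'.
print([w.name for w in bn.trainable_weights])

# Two non-trainable running statistics: 'moving_mean' and 'moving_variance'.
print([w.name for w in bn.non_trainable_weights])
```

Only gamma and beta receive gradient updates through back-propagation; the moving mean and variance are updated by an exponential moving average during forward passes in training mode.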

Therefore:

  1. Generally, we set training=True during the training procedure. When it comes to transfer learning, however, it's optional: both True and False are acceptable. The former lets the BN statistics adapt to the new data, while the latter keeps using the BN statistics learned on the previous dataset.

  2. training=False puts the BN layer in inference mode: it normalizes using the stored moving mean and variance instead of the batch statistics, and it does not update those moving averages. (On its own it does not freeze gamma and beta; their updates are controlled by the layer's trainable attribute.) When testing, it's necessary to set training=False, otherwise the batch statistics of the test data would leak into the normalization, making the test accuracy unreliable.