I am calling model.fit() several times. Each call trains one block of layers while the other layers are frozen.
CODE
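# imports (assumption: efn is the tf.keras build of the efficientnet package)
import efficientnet.tfkeras as efn
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model
# create the base pre-trained model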
base_model = efn.EfficientNetB0(input_tensor=input_tensor,weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# add a fully-connected layer
x = Dense(x.shape[1], activation='relu', name='first_dense')(x)
x = Dropout(0.5)(x)
x = Dense(x.shape[1], activation='relu', name='output')(x)
x = Dropout(0.5)(x)
no_classes=10
predictions = Dense(no_classes, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional layers
for layer in base_model.layers:
    layer.trainable = False
#FIRST COMPILE
model.compile(optimizer='Adam', loss=loss_function,
              metrics=['accuracy'])
#FIRST FIT
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=top_epoch,
          verbose=verbosity,
          validation_split=validation_split)
# Generate generalization metrics
scores = model.evaluate(features[test], labels[test], verbose=1)
print(scores)
#Let all layers be trainable
for layer in model.layers:
    layer.trainable = True
from tensorflow.keras.optimizers import SGD
#SECOND COMPILE
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss=loss_function,
              metrics=['accuracy'])
#SECOND FIT
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=no_epochs,
          verbose=verbosity,
          validation_split=validation_split)
What is weird is that in the second fit, the accuracy after the first epoch is much lower than the accuracy after the last epoch of the first fit.
RESULT
Epoch 40/40
6286/6286 [==============================] - 14s 2ms/sample - loss: 0.2370 - accuracy: 0.9211 - val_loss: 1.3579 - val_accuracy: 0.6762
874/874 [==============================] - 2s 2ms/sample - loss: 0.4122 - accuracy: 0.8764
Train on 6286 samples, validate on 1572 samples
Epoch 1/40
6286/6286 [==============================] - 60s 9ms/sample - loss: 5.9343 - accuracy: 0.5655 - val_loss: 2.4981 - val_accuracy: 0.5115
I think the second fit does not start from the weights learned in the first fit.
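One way this suspicion could be tested is to snapshot the weights right after the first fit and compare them with the weights right after the second compile, before the second fit runs. Below is a minimal sketch of such a comparison. The helper name max_weight_change is hypothetical (it is not part of Keras), and the arrays at the bottom are stand-ins for what model.get_weights() returns.

```python
import numpy as np

def max_weight_change(before, after):
    """Return the largest absolute element-wise difference between two
    lists of weight arrays (hypothetical helper, not a Keras function)."""
    if len(before) != len(after):
        raise ValueError("weight lists have different lengths")
    diffs = []
    for b, a in zip(before, after):
        b, a = np.asarray(b), np.asarray(a)
        if b.size > 0:  # skip empty arrays
            diffs.append(float(np.max(np.abs(b - a))))
    return max(diffs, default=0.0)

# Stand-in arrays; with the real model these would come from model.get_weights()
before = [np.ones((2, 3)), np.zeros(3)]
same = [w.copy() for w in before]
changed = [w.copy() for w in before]
changed[1][0] = 0.5

print(max_weight_change(before, same))     # 0.0
print(max_weight_change(before, changed))  # 0.5
```

With the real model, the snapshot would be taken as [w.copy() for w in model.get_weights()] after the first fit and passed together with model.get_weights() after the second compile. A result of 0.0 would mean the weights were carried over, so the drop in accuracy would have to come from something else.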
Thanks in advance!!!