0 votes

I am calling model.fit() several times; each call trains one block of layers while the other layers are frozen.

CODE

    # imports (assuming the 'efficientnet' package for efn); input_tensor,
    # loss_function, features, labels, the train/test indices and the fit
    # hyper-parameters are defined elsewhere
    import efficientnet.tfkeras as efn
    from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import SGD

    # create the base pre-trained model
    base_model = efn.EfficientNetB0(input_tensor=input_tensor, weights='imagenet', include_top=False)

    # add a global spatial average pooling layer
    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    # add two fully-connected layers, each followed by dropout
    x = Dense(x.shape[1], activation='relu', name='first_dense')(x)
    x = Dropout(0.5)(x)
    x = Dense(x.shape[1], activation='relu', name='output')(x)
    x = Dropout(0.5)(x)

    no_classes = 10
    predictions = Dense(no_classes, activation='softmax')(x)

    # this is the model we will train
    model = Model(inputs=base_model.input, outputs=predictions)

    # first: train only the top layers (which were randomly initialized),
    # i.e. freeze all convolutional layers
    for layer in base_model.layers:
        layer.trainable = False

    # FIRST COMPILE
    model.compile(optimizer='Adam', loss=loss_function,
                  metrics=['accuracy'])

    # FIRST FIT
    model.fit(features[train], labels[train],
              batch_size=batch_size,
              epochs=top_epoch,
              verbose=verbosity,
              validation_split=validation_split)

    # generate generalization metrics
    scores = model.evaluate(features[test], labels[test], verbose=1)
    print(scores)

    # let all layers be trainable
    for layer in model.layers:
        layer.trainable = True

    # SECOND COMPILE
    model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9), loss=loss_function,
                  metrics=['accuracy'])

    # SECOND FIT
    model.fit(features[train], labels[train],
              batch_size=batch_size,
              epochs=no_epochs,
              verbose=verbosity,
              validation_split=validation_split)

What is weird is that in the second fit, the accuracy after the first epoch is much lower than the accuracy at the last epoch of the first fit.

RESULT

Epoch 40/40
6286/6286 [==============================] - 14s 2ms/sample - loss: 0.2370 - accuracy: 0.9211 - val_loss: 1.3579 - val_accuracy: 0.6762
874/874 [==============================] - 2s 2ms/sample - loss: 0.4122 - accuracy: 0.8764

Train on 6286 samples, validate on 1572 samples
Epoch 1/40
6286/6286 [==============================] - 60s 9ms/sample - loss: 5.9343 - accuracy: 0.5655 - val_loss: 2.4981 - val_accuracy: 0.5115


I suspect that the second fit does not start from the weights learned in the first fit.
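One way to test that suspicion (a quick sketch reusing the question's variables): grab a weight tensor after the first fit, recompile, and compare; compile by itself should leave the weights untouched.

    import numpy as np
    from tensorflow.keras.optimizers import SGD

    # copy one kernel from the model after the first fit
    w_before = model.get_weights()[0].copy()

    # recompile with the new optimizer
    model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9),
                  loss=loss_function, metrics=['accuracy'])

    # compile alone should not change the weights
    w_after = model.get_weights()[0]
    print(np.allclose(w_before, w_after))   # expected: True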

Thanks in advance!!!

Hmm, to my knowledge model.compile should not change the weights. Try not compiling the second time; just call model.fit for the second fit, just in case, and see if that works. – Gerry P
If you really want to use a different optimizer, then you have to compile. At the end of the first fit, run model.save_weights. Then compile. After you compile, run model.load_weights, then do the second fit. – Gerry P
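A minimal sketch of that save/compile/load workflow, reusing the question's variables (the checkpoint filename is hypothetical):

    from tensorflow.keras.optimizers import SGD

    # snapshot the weights learned during the first fit
    model.save_weights('after_first_fit.h5')

    # recompile with the new optimizer...
    model.compile(optimizer=SGD(learning_rate=0.0001, momentum=0.9),
                  loss=loss_function,
                  metrics=['accuracy'])

    # ...and restore the saved weights before the second fit
    model.load_weights('after_first_fit.h5')
    model.fit(features[train], labels[train],
              batch_size=batch_size,
              epochs=no_epochs,
              verbose=verbosity,
              validation_split=validation_split)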

2 Answers

0 votes

I think this is the result of using a different optimizer. You used Adam in the first fit and SGD in the second fit. Try using Adam in the second fit as well and see if it works correctly.
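For example, a sketch that keeps Adam for the second fit (the learning rate here is illustrative):

    from tensorflow.keras.optimizers import Adam

    # stay with Adam for the fine-tuning phase instead of switching to SGD
    model.compile(optimizer=Adam(learning_rate=0.0001),
                  loss=loss_function,
                  metrics=['accuracy'])
    model.fit(features[train], labels[train],
              batch_size=batch_size,
              epochs=no_epochs,
              verbose=verbosity,
              validation_split=validation_split)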

0 votes

I solved this by removing the second compile call.
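In other words, after the first fit just call fit again without recompiling, so the weights and the optimizer state from the first fit carry over (a sketch with the question's variables):

    # no second compile: continue training directly from the first fit's weights
    model.fit(features[train], labels[train],
              batch_size=batch_size,
              epochs=no_epochs,
              verbose=verbosity,
              validation_split=validation_split)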