4 votes

I am using a stateful LSTM regression model and I would like to apply the EarlyStopping callback. From what I have read, in stateful LSTMs the states should be reset at each epoch. However, I noticed that when I reset the states, EarlyStopping stopped working altogether. I attach the code below.


from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

model = Sequential()
model.add(LSTM(256, batch_input_shape=(batch_size, timesteps, features), return_sequences=False, stateful=True))
model.add(Dropout(rate=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam')
mc = ModelCheckpoint('best_model.h5', monitor='val_loss', mode='min', verbose=0, save_best_only=True)
es = EarlyStopping(monitor='val_loss', mode='min', patience=1, restore_best_weights=True, verbose=1)

for epoch in range(epochs):
    print("Epoch: ", epoch + 1)

    hist = model.fit(train_x, train_y, epochs=1, batch_size=batch_size, shuffle=False,
                     validation_data=(validation_x, validation_y), verbose=2, callbacks=[mc, es])
    model.reset_states()

If I run the above code without the for loop and without resetting the states, EarlyStopping works fine. Is there any way to apply EarlyStopping inside a for loop?

Thank you in advance


3 Answers

2 votes

It seems that EarlyStopping cannot work when epochs=1 in the model.fit() function. As I understand it, this is because each model.fit() call starts a fresh training run of a single epoch, so EarlyStopping never sees more than one epoch at a time and has nothing to compare against; it can only work if the number of epochs in model.fit() is higher than 1.
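To see why, here is a minimal sketch that drives the callback by hand with made-up loss values (the types.SimpleNamespace stub below merely stands in for a real model): every model.fit() call fires on_train_begin, which resets the callback's counters, so with epochs=1 the patience counter can never accumulate across calls:

```python
import types
from tensorflow.keras.callbacks import EarlyStopping

es = EarlyStopping(monitor='val_loss', mode='min', patience=2)
stub = types.SimpleNamespace(stop_training=False)  # stands in for a real model
es.set_model(stub)

# Simulate four model.fit(epochs=1) calls with a steadily worsening loss.
# Each fit() call triggers on_train_begin, which resets wait, best, etc.
for val_loss in [1.0, 2.0, 3.0, 4.0]:
    es.on_train_begin()
    es.on_epoch_end(0, {'val_loss': val_loss})

# Despite four epochs of worsening loss, the callback never trips:
# after each reset, any value beats the re-initialised best of infinity.
print(es.wait, es.stopped_epoch)  # -> 0 0
```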

I used the following code to save the best model and to halt the training process once the validation loss has stopped improving for a number of epochs.

import numpy as np
import pandas as pd

# Number of epochs to wait before halting the training process
patience = 50

# Store the metrics of each epoch to a pandas dataframe
history = pd.DataFrame()

# Define a high loss value (this may change based on the classification problem that you have)
min_loss = 2.00

# Define a minimum accuracy value
min_acc = 0.25

# Initialize the wait variable        
wait = 0

for epoch in range(epochs):
    print("Epoch: ", epoch + 1)

    hist = model.fit(train_x, train_y, epochs=1, batch_size=batch_size, shuffle=False,
                     validation_data=(validation_x, validation_y), verbose=2)
    model.reset_states()

    # Stop immediately if the loss diverged
    if np.isnan(hist.history['val_loss'][0]):
        break

    if round(hist.history['val_loss'][0], 4) < min_loss:
        # New best model: save it and reset the patience counter
        min_loss = round(hist.history['val_loss'][0], 4)
        min_acc = hist.history['val_accuracy'][0]
        model.save('best_model')
        wait = 0
    else:
        wait += 1
        print('*' * 50)
        print(f"Patience: {wait}/{patience}", "-", "Current best val_accuracy:",
              '{0:.5}'.format(min_acc),
              "with loss:", '{0:.5}'.format(min_loss), f"at epoch {epoch - wait}")
        print('*' * 50)

        if wait >= patience:
            break

    # Record the metrics of the finished epoch
    history.loc[epoch, 'epoch'] = epoch + 1
    history.loc[epoch, 'loss'] = hist.history['loss'][0]
    history.loc[epoch, 'val_loss'] = hist.history['val_loss'][0]
    history.loc[epoch, 'accuracy'] = hist.history['accuracy'][0]
    history.loc[epoch, 'val_accuracy'] = hist.history['val_accuracy'][0]

history.to_csv('history.csv', header=True, index=False)
1 vote

I've written a custom subclass of EarlyStopping that does not reset its state on subsequent calls to model.fit() and keeps track of the number of completed epochs:

from tensorflow.keras.callbacks import EarlyStopping

class CustomES(EarlyStopping):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.started = False
        self.epoch = 0

    def on_train_begin(self, logs=None):
        if self.started:
            return
        self.started = True
        super().on_train_begin(logs)

    def on_epoch_begin(self, epoch, logs=None):
        super().on_epoch_begin(self.epoch, logs)

    def on_epoch_end(self, epoch, logs=None):
        super().on_epoch_end(self.epoch, logs)
        self.epoch += 1

Inside the for loop you can then break out of the loop once es.stopped_epoch > 0.
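To illustrate (a sketch only: the types.SimpleNamespace stub stands in for the model, and the callback is driven by hand with made-up losses), the guarded on_train_begin lets the best loss and the patience counter survive across fit() calls, so the subclass can eventually trip and set stopped_epoch:

```python
import types
from tensorflow.keras.callbacks import EarlyStopping

class CustomES(EarlyStopping):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.started = False
        self.epoch = 0

    def on_train_begin(self, logs=None):
        if self.started:          # skip the reset on every fit() after the first
            return
        self.started = True
        super().on_train_begin(logs)

    def on_epoch_begin(self, epoch, logs=None):
        super().on_epoch_begin(self.epoch, logs)

    def on_epoch_end(self, epoch, logs=None):
        super().on_epoch_end(self.epoch, logs)
        self.epoch += 1

es = CustomES(monitor='val_loss', mode='min', patience=2)
stub = types.SimpleNamespace(stop_training=False)  # stands in for a real model
es.set_model(stub)

# First "fit(epochs=1)" call: the loss improves to 1.0
es.on_train_begin()
es.on_epoch_end(0, {'val_loss': 1.0})

# Second "fit()" call: on_train_begin is skipped, so best=1.0 survives
es.on_train_begin()
es.on_epoch_end(0, {'val_loss': 2.0})  # worse, wait = 1
es.on_epoch_end(0, {'val_loss': 3.0})  # worse again, wait = 2 -> stop

print(es.epoch, es.stopped_epoch > 0)  # -> 3 True
```

In the real loop you would pass es to each model.fit(..., callbacks=[es]) call and break as soon as es.stopped_epoch > 0.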

0 votes

For stateful LSTMs we usually call model.fit() multiple times just so that we can call model.reset_states() after each epoch. However, we can reset the states without calling fit() multiple times, by overriding the on_epoch_end method of EarlyStopping as below:

from tensorflow.keras.callbacks import EarlyStopping

class MyEarlyStopping(EarlyStopping):

    def on_epoch_end(self, epoch, logs=None):
        self.model.reset_states()
        super().on_epoch_end(epoch, logs)
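As a quick check (again a sketch: the stub object and the loss values are made up for illustration), driving the callback by hand shows that the states are cleared on every epoch while the usual early-stopping bookkeeping still runs, so a single model.fit(epochs=N, shuffle=False, callbacks=[MyEarlyStopping(...)]) call suffices:

```python
import types
from tensorflow.keras.callbacks import EarlyStopping

class MyEarlyStopping(EarlyStopping):
    def on_epoch_end(self, epoch, logs=None):
        self.model.reset_states()          # clear the LSTM states every epoch
        super().on_epoch_end(epoch, logs)  # then run the normal early-stopping check

resets = []
stub = types.SimpleNamespace(stop_training=False,
                             reset_states=lambda: resets.append(True))
es = MyEarlyStopping(monitor='val_loss', mode='min', patience=1)
es.set_model(stub)
es.on_train_begin()

for epoch, val_loss in enumerate([1.0, 0.5, 0.7]):  # improves, then worsens
    es.on_epoch_end(epoch, {'val_loss': val_loss})

print(len(resets), es.stopped_epoch)  # -> 3 2
```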