13 votes

I can't keep my PC running all day long, so I need to save the training history after every epoch. For example, suppose I train my model for 100 epochs one day, and the next day I want to train it for another 50 epochs. I then need to generate loss-vs-epoch and accuracy-vs-epoch graphs covering the whole 150 epochs. I am using the fit_generator method. Is there any way to save the training history after every epoch (most likely using a Callback)? I know how to save the training history after training has ended. I am using the TensorFlow backend.


3 Answers

12 votes

Keras has the CSVLogger callback, which appears to do exactly what you need; from the documentation:

Callback that streams epoch results to a CSV file.

It has an append parameter for appending to an existing file. Again, from the documentation:

append: Boolean. True: append if file exists (useful for continuing training). False: overwrite existing file

from keras.callbacks import CSVLogger

# append=True lets histories from separate runs accumulate in one file
csv_logger = CSVLogger("model_history_log.csv", append=True)
model.fit_generator(..., callbacks=[csv_logger])
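
Since append=True makes the CSV accumulate one row per epoch across runs, the full 150-epoch curves can be rebuilt from it afterwards. A minimal sketch, assuming pandas is installed and that accuracy was compiled as a metric (column names such as acc and val_acc depend on your Keras version and compiled metrics):

import pandas as pd
import matplotlib.pyplot as plt

log = pd.read_csv("model_history_log.csv")

# Plot against the row index rather than the "epoch" column, since the
# epoch counter restarts at 0 on a resumed run unless initial_epoch is set
plt.plot(log["loss"].values, label="train loss")
plt.plot(log["val_loss"].values, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
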
5 votes

I had a similar requirement, so I went for a naive approach.

1. Python code to run for the first 50 epochs:
I saved both the model trained for 50 epochs and its history; the .history attribute of the History object returned by fit_generator stores the per-epoch metrics.

from pickle import dump

history = model.fit_generator(......)  # train the model for 50 epochs
model.save("trainedmodel_50Epoch.h5")  # save the model
with open('trainHistoryOld', 'wb') as handle:  # save the model's history
    dump(history.history, handle)

2. Python code for loading the trained model and training it for another 50 epochs:

from pickle import dump
from keras.models import load_model

model = load_model('trainedmodel_50Epoch.h5')  # load the model trained for 50 epochs

hstry = model.fit_generator(......)  # train the model for another 50 epochs

model.save("trainedmodel_100Epoch.h5")  # save under a new name so the 50-epoch model is kept

with open('trainHistoryNew', 'wb') as handle:  # save the new history to a separate file so the old one is not overwritten
    dump(hstry.history, handle)

from pickle import load
import matplotlib.pyplot as plt

with open('trainHistoryOld', 'rb') as handle:  # load the old history
    oldhstry = load(handle)

# Append the new run's metrics so the plots cover all 100 epochs;
# hstry is a History object, so its dict lives in hstry.history
oldhstry['loss'].extend(hstry.history['loss'])
oldhstry['acc'].extend(hstry.history['acc'])
oldhstry['val_loss'].extend(hstry.history['val_loss'])
oldhstry['val_acc'].extend(hstry.history['val_acc'])

# Plotting the Accuracy vs Epoch Graph
plt.plot(oldhstry['acc'])
plt.plot(oldhstry['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

# Plotting the Loss vs Epoch Graph
plt.plot(oldhstry['loss'])
plt.plot(oldhstry['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
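
If you track more metrics than the four above, a loop over the keys avoids repeating the extend calls; a minimal sketch, assuming both histories contain the same metric names:

# Extend every metric list in the old history with the new run's values
for key in oldhstry:
    oldhstry[key].extend(hstry.history[key])
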

You can also create a custom callback class, as described in another answer.

4 votes

To save the model history you have two options.

  1. Use the Keras ModelCheckpoint callback class (see the sketch right after this list)
  2. Create a custom class
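
The original answer only shows option 2; here is a minimal sketch of option 1 with default arguments (note that ModelCheckpoint saves the model itself, so the per-epoch metrics still need to be captured separately, e.g. with CSVLogger):

from keras.callbacks import ModelCheckpoint

# {epoch:02d} in the file name is filled in by Keras at save time,
# so every epoch gets its own checkpoint file
checkpoint = ModelCheckpoint("model-{epoch:02d}.h5")
model.fit_generator(..., callbacks=[checkpoint])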

Here is how to create a custom checkpoint callback class.

import keras

class CustomModelCheckPoint(keras.callbacks.Callback):
    def __init__(self, **kwargs):
        super(CustomModelCheckPoint, self).__init__(**kwargs)
        self.epoch_accuracy = {}  # accuracy at a given epoch
        self.epoch_loss = {}      # loss at a given epoch

    def on_epoch_begin(self, epoch, logs=None):
        # Nothing to do at the beginning of an epoch
        return

    def on_epoch_end(self, epoch, logs=None):
        # Record the metrics and save the weights at the end of each epoch
        logs = logs or {}
        self.epoch_accuracy[epoch] = logs.get("acc")
        self.epoch_loss[epoch] = logs.get("loss")
        self.model.save_weights("name-of-model-%d.h5" % epoch)  # save the model weights

Now, to use the callback class:

checkpoint = CustomModelCheckPoint()
model.fit_generator(..., callbacks=[checkpoint])

After training, the checkpoint.epoch_accuracy dictionary contains the accuracy at each epoch, and the checkpoint.epoch_loss dictionary contains the loss at each epoch.
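
Since the question asks for history that survives a restart, the recorded dictionaries can also be written to disk at the end of every epoch. A minimal sketch of that extension; the subclass name, file name, and use of pickle are my own choices, not part of the original answer:

import pickle
import keras

class PersistentCheckPoint(keras.callbacks.Callback):
    def __init__(self, path="epoch_history.pkl"):  # hypothetical file name
        super(PersistentCheckPoint, self).__init__()
        self.path = path
        self.epoch_accuracy = {}
        self.epoch_loss = {}

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch_accuracy[epoch] = logs.get("acc")
        self.epoch_loss[epoch] = logs.get("loss")
        # Overwrite the pickle each epoch so the file always holds
        # everything recorded so far
        with open(self.path, "wb") as handle:
            pickle.dump({"acc": self.epoch_accuracy,
                         "loss": self.epoch_loss}, handle)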