I am writing neural network code in TensorFlow. I set it up to save the variables every 1000 epochs, so I expect the variables from epochs 1001, 2001, 3001, ... to be saved to separate files. Below is the save function I wrote.
import os

def save(self, epoch):
    model_name = "MODEL_save"
    checkpoint_dir = os.path.join(model_name)
    if not os.path.exists(checkpoint_dir):
        os.makedirs(checkpoint_dir)
    # Checkpoint tagged with the epoch number (history of the variables).
    self.saver.save(self.sess, checkpoint_dir + '/model', global_step=epoch)
    # Checkpoint without an epoch suffix (latest variables).
    self.saver.save(self.sess, checkpoint_dir + '/model')
    print("path for saved %s" % checkpoint_dir)
The function saves twice each time it is called: once with 'global_step=epoch', to keep a history of the variables for every 1000 epochs, and once without an epoch in the file name, to keep the latest variables. I call this function whenever the epoch condition is met, like below.
for epoch in xrange(self.m_total_epoch):
    .... CODE FOR NEURAL NETWORK ....
    if epoch % 1000 == 1 and epoch != 1:
        self.save(epoch)
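To make the condition concrete, here is a small standalone sketch (plain Python, no TensorFlow) that lists the epochs at which the check above would call save. The helper name epochs_that_trigger_save and the loop bound 29327 are only for illustration and are not part of my actual code.

```python
def epochs_that_trigger_save(total_epochs):
    """Return the epochs for which `epoch % 1000 == 1 and epoch != 1` holds."""
    return [epoch for epoch in range(total_epochs)
            if epoch % 1000 == 1 and epoch != 1]

# 29327 is a hypothetical loop bound, chosen so that epoch 29326 is included.
expected = epochs_that_trigger_save(29327)
print(expected[:3], expected[-1], len(expected))
# prints: [1001, 2001, 3001] 29001 29
```

So by this count, 29 epoch-tagged checkpoints should have been written by that point.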
Assuming the current epoch is 29326, I expect the directory to contain saved files for every checkpoint: 1001, 2001, 3001, ... up to 29001. However, only some of them are there: 26001, 27001, 28001, and 29001. I checked and the same thing happens on other computers. This is different from what I expected. Why does it happen?