
I am using a TensorFlow custom Estimator with AdamOptimizer, so my model_fn looks like this:

def model_fn(features, labels, mode, params):
  ...
  loss = ...
  if mode == tf.estimator.ModeKeys.TRAIN:
    # Passing global_step so the step counter advances with training.
    train_op = tf.train.AdamOptimizer(params['learning_rate']).minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
  elif mode == tf.estimator.ModeKeys.EVAL:
    return tf.estimator.EstimatorSpec(mode, loss=loss)
  elif mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode, predictions=predictions)

I want to implement an early stopping mechanism. To simplify, I am doing the following:

for epoch in range(1000):
  model.train(input_fn=input_fn, steps=steps_per_epoch)  # one epoch per call
  loss = model.evaluate(input_fn=eval_input_fn)['loss']
  if loss < 0.001:
    break

So model.train is called in a loop, processing one epoch of data per call.

My question is: AdamOptimizer (and many other optimizers) keeps internal state, such as the moment estimates that effectively adapt the learning rate, and that state is supposed to evolve during minimization. Will it be saved between two calls to model.train, or will it be reinitialized at every call?

And if the latter, how can I make TensorFlow remember these variables between two calls to model.train?

1 Answer


After every call to model.train(), the model state is saved to a checkpoint. All of the Adam optimizer's parameters, namely its slot variables holding the first and second moment estimates and the beta1_power/beta2_power accumulators, are ordinary variables in the TensorFlow graph, so they are saved in the checkpoint too and restored at the start of the next call to model.train().

You should also look into tf.estimator.train_and_evaluate (https://www.tensorflow.org/api_docs/python/tf/estimator/train_and_evaluate).

It automatically runs evaluation whenever a checkpoint is saved. You can control checkpoint and evaluation frequency through RunConfig and the TrainSpec/EvalSpec objects.