TensorFlow 1.4 moves the Dataset API into core (tf.data.Dataset), and the docs/tutorials suggest using tf.estimator to train models.
However, as recommended at the end of this page, the Dataset object and its iterator must be instantiated inside the input_fn function. This means the iteration through the dataset starts over on each call to estimator.train(input_fn, steps). Consequently, calling it with steps smaller than the number of samples in an epoch trains the model on only a subset of the dataset.
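For reference, a minimal sketch of that recommended pattern (features and labels are stand-in in-memory NumPy arrays, not names from my actual code):

import numpy as np
import tensorflow as tf

# Stand-in training data for illustration.
features = np.random.rand(100, 4).astype(np.float32)
labels = np.random.randint(0, 2, size=100).astype(np.int32)

def train_input_fn():
    # A fresh Dataset and iterator are built on every call, so each
    # estimator.train(input_fn=train_input_fn, steps=...) invocation
    # starts reading again from the first shuffled sample.
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    dataset = dataset.shuffle(buffer_size=100).batch(16)
    return dataset.make_one_shot_iterator().get_next()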
Hence my question: is it possible to implement something like the following with Estimator + Dataset:
for i in range(num_epochs):
    # Train for some steps
    estimator.train(input_fn=train_input_fn, steps=valid_freq)

    # Evaluate on the validation set (steps=None, we evaluate on the full validation set)
    estimator.evaluate(input_fn=valid_input_fn)
without restarting the iteration over the training samples from scratch at each call to estimator.train(input_fn=train_input_fn, steps=valid_freq)?
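One workaround I can think of (an untested sketch, not something from the linked docs) is to keep the iteration state in Python by backing the Dataset with a single persistent generator via tf.data.Dataset.from_generator. Since the generator object outlives each graph, the position in the data should survive across train() calls, modulo whatever the input pipeline had already buffered:

import numpy as np
import tensorflow as tf

features = np.random.rand(100, 4).astype(np.float32)
labels = np.random.randint(0, 2, size=100).astype(np.int32)

def sample_generator():
    # Loops forever; its position is plain Python state.
    while True:
        for x, y in zip(features, labels):
            yield x, y

# Created once, outside input_fn, so it persists across calls.
persistent_gen = sample_generator()

def train_input_fn():
    # from_generator expects a callable returning a generator; handing
    # back the same instance each time makes the new Dataset resume
    # where the previous estimator.train() call stopped.
    dataset = tf.data.Dataset.from_generator(
        lambda: persistent_gen,
        output_types=(tf.float32, tf.int32),
        output_shapes=(tf.TensorShape([4]), tf.TensorShape([])))
    return dataset.batch(16).make_one_shot_iterator().get_next()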
For example, unlike here, could one instantiate the Dataset and its iterator outside input_fn? I tried that, but it does not work, because the input (from the dataset iterator) and the model (from the estimator's model_fn) then end up in different graphs.
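For completeness, the failing attempt looked roughly like this (a reconstruction reusing the features/labels arrays from the sketch above, not my exact code). estimator.train() builds its own tf.Graph and calls input_fn inside it, so tensors created beforehand in the default graph belong to a different graph:

# Built in the default graph, before estimator.train() is invoked:
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
next_batch = dataset.make_one_shot_iterator().get_next()

def train_input_fn():
    # Estimator.train() runs input_fn inside a freshly created tf.Graph,
    # so returning tensors from the default graph fails with an error
    # along the lines of "Tensor ... must be from the same graph as
    # Tensor ...".
    return next_batch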
Thanks
Related GitHub issue