
I'm trying to load a pre-trained TensorFlow object detection model from the TensorFlow Object Detection repo as a tf.estimator.Estimator and use it to make predictions.

I'm able to load the model and run inference using Estimator.predict(); however, the output is garbage. Other ways of loading the model and running inference, e.g. as a Predictor, work fine.

Any help properly loading a model as an Estimator and calling predict() would be much appreciated. My current code:

Load and prepare image

from io import BytesIO

import numpy as np
import requests
import tensorflow as tf
from PIL import Image

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(list(image.getdata())).reshape((im_height, im_width, 3)).astype(np.uint8)

image_url = 'https://i.imgur.com/rRHusZq.jpg'

# Load image
response = requests.get(image_url)
image = Image.open(BytesIO(response.content))

# Format original image size
im_size_orig = np.array(list(image.size) + [1])
im_size_orig = np.expand_dims(im_size_orig, axis=0)
im_size_orig = np.int32(im_size_orig)

# Resize image
image = image.resize((np.array(image.size) / 4).astype(int))

# Format image
image_np = load_image_into_numpy_array(image)
image_np_expanded = np.expand_dims(image_np, axis=0)
image_np_expanded = np.float32(image_np_expanded)

# Stick into feature dict
x = {'image': image_np_expanded, 'true_image_shape': im_size_orig}

# Stick into input function
predict_input_fn = tf.estimator.inputs.numpy_input_fn(
    x=x,
    y=None,
    shuffle=False,
    batch_size=128,
    queue_capacity=1000,
    num_epochs=1,
    num_threads=1,
)

Side note:

train_and_eval_dict also seems to contain an input_fn for prediction:

train_and_eval_dict['predict_input_fn']

However, calling this function actually returns a tf.estimator.export.ServingInputReceiver, which I'm not sure what to do with. This could potentially be the source of my problems, as there's a fair bit of pre-processing involved before the model actually sees the image.
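As far as I understand, a serving_input_receiver_fn like this is meant to be passed to Estimator.export_savedmodel() rather than to Estimator.predict(). A rough sketch of that usage (the export directory name here is made up):

# Sketch only: a serving input fn is used when exporting the Estimator,
# not when calling predict(). './exported_model' is just an example path.
serving_input_fn = train_and_eval_dict['predict_input_fn']
estimator.export_savedmodel('./exported_model', serving_input_fn)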

Load model as Estimator

Model downloaded from the TF Model Zoo here; code to load the model adapted from here.

import os

# These two modules come from the TF Object Detection repo (as used in model_main.py)
from object_detection import model_hparams, model_lib

model_dir = './pretrained_models/tensorflow/ssd_mobilenet_v1_coco_2018_01_28/'
pipeline_config_path = os.path.join(model_dir, 'pipeline.config')

config = tf.estimator.RunConfig(model_dir=model_dir)

train_and_eval_dict = model_lib.create_estimator_and_inputs(
    run_config=config,
    hparams=model_hparams.create_hparams(None),
    pipeline_config_path=pipeline_config_path,
    train_steps=None,
    sample_1_of_n_eval_examples=1,
    sample_1_of_n_eval_on_train_examples=(5))

estimator = train_and_eval_dict['estimator']

Run inference

output_dict1 = estimator.predict(predict_input_fn)
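Note that Estimator.predict() returns a generator, so nothing actually runs until it is consumed; I pull out the single prediction roughly like this (the variable name is mine):

# Consuming the generator triggers graph construction and checkpoint restore.
predictions = next(output_dict1)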

This prints out some log messages, one of which is:

INFO:tensorflow:Restoring parameters from ./pretrained_models/tensorflow/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt

So it seems like the pre-trained weights are getting loaded. However, the results look like this:

[Image with bad detections]

Load same model as a Predictor

from tensorflow.contrib import predictor

model_dir = './pretrained_models/tensorflow/ssd_mobilenet_v1_coco_2018_01_28'
saved_model_dir = os.path.join(model_dir, 'saved_model')
predict_fn = predictor.from_saved_model(saved_model_dir)

Run inference

output_dict2 = predict_fn({'inputs': image_np_expanded})

Results look good:

[Image with good detections]
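For reference, this is roughly how I read the Predictor output; the keys are the standard ones in SavedModels exported by the Object Detection API, and the 0.5 score threshold is arbitrary:

# Sketch: keep detections above an arbitrary score threshold.
# Outputs are batched, hence the [0] index; boxes are normalized [ymin, xmin, ymax, xmax].
num = int(output_dict2['num_detections'][0])
boxes = output_dict2['detection_boxes'][0][:num]
scores = output_dict2['detection_scores'][0][:num]
classes = output_dict2['detection_classes'][0][:num]

keep = scores > 0.5
print(boxes[keep], classes[keep], scores[keep])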

Comments:

Did you manage to make it work as an Estimator? – Mat
Nope, gave up and just used predictor. – DavidS

1 Answer


When you load the model as an Estimator from a checkpoint file, this is the restore function associated with SSD models, from ssd_meta_arch.py:

def restore_map(self,
                  fine_tune_checkpoint_type='detection',
                  load_all_detection_checkpoint_vars=False):
    """Returns a map of variables to load from a foreign checkpoint.
    See parent class for details.
    Args:
      fine_tune_checkpoint_type: whether to restore from a full detection
        checkpoint (with compatible variable names) or to restore from a
        classification checkpoint for initialization prior to training.
        Valid values: `detection`, `classification`. Default 'detection'.
      load_all_detection_checkpoint_vars: whether to load all variables (when
         `fine_tune_checkpoint_type='detection'`). If False, only variables
         within the appropriate scopes are included. Default False.
    Returns:
      A dict mapping variable names (to load from a checkpoint) to variables in
      the model graph.
    Raises:
      ValueError: if fine_tune_checkpoint_type is neither `classification`
        nor `detection`.
    """
    if fine_tune_checkpoint_type not in ['detection', 'classification']:
      raise ValueError('Not supported fine_tune_checkpoint_type: {}'.format(
          fine_tune_checkpoint_type))

    if fine_tune_checkpoint_type == 'classification':
      return self._feature_extractor.restore_from_classification_checkpoint_fn(
          self._extract_features_scope)

    if fine_tune_checkpoint_type == 'detection':
      variables_to_restore = {}
      for variable in tf.global_variables():
        var_name = variable.op.name
        if load_all_detection_checkpoint_vars:
          variables_to_restore[var_name] = variable
        else:
          if var_name.startswith(self._extract_features_scope):
            variables_to_restore[var_name] = variable

    return variables_to_restore

As you can see, even if the config file sets from_detection_checkpoint: true, only the variables in the feature-extractor scope will be restored. To restore all the variables, you will have to set

load_all_detection_checkpoint_vars: True

in the config file.
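That is, the train_config section of pipeline.config would need something along these lines (the checkpoint path is abbreviated; only the last line is the relevant addition):

train_config {
  ...
  fine_tune_checkpoint: ".../model.ckpt"
  from_detection_checkpoint: true
  load_all_detection_checkpoint_vars: true
}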

So the situation above is quite clear: when loading the model as an Estimator, only the variables from the feature-extractor scope are restored. Since the box predictor's weights are not restored, the Estimator unsurprisingly gives essentially random predictions.

When loading the model as a Predictor, all weights are loaded from the SavedModel, so the predictions are reasonable.
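One quick way to sanity-check which variables the downloaded checkpoint actually contains, and how their names relate to the feature-extractor scope that restore_map filters on, is something like:

# Sketch: list the variables stored in the downloaded checkpoint.
import tensorflow as tf

ckpt = './pretrained_models/tensorflow/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt'
for name, shape in tf.train.list_variables(ckpt):
    print(name, shape)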