
I trained my model on Google Cloud ML Engine with the following configuration.

JOB_NAME="object_detection_$(date +%m_%d_%Y_%H_%M_%S)"
echo $JOB_NAME
gcloud ml-engine jobs submit training $JOB_NAME \
        --job-dir=gs://$1/train \
        --scale-tier BASIC_GPU \
        --runtime-version 1.12 \
        --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
        --module-name object_detection.model_main \
        --region europe-west1 \
        -- \
        --model_dir=gs://$1/train \
        --pipeline_config_path=gs://$1/data/fast_rcnn_resnet101_coco.config

After training, I downloaded the latest checkpoint from GCP and exported the model using the following command:

python export_inference_graph.py \
        --input_type encoded_image_string_tensor \
        --pipeline_config_path training/fast_rcnn_resnet101_coco.config \
        --trained_checkpoint_prefix training/model.ckpt-11127 \
        --output_directory exported_graphs

The exported model's SavedModel signature looks like this:

The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs'] tensor_info:
      dtype: DT_UINT8
      shape: (-1, -1, -1, 3)
      name: image_tensor:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['detection_boxes'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300, 4)
      name: detection_boxes:0
  outputs['detection_classes'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300)
      name: detection_classes:0
  outputs['detection_features'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, -1, -1, -1, -1)
      name: detection_features:0
  outputs['detection_multiclass_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300, 2)
      name: detection_multiclass_scores:0
  outputs['detection_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300)
      name: detection_scores:0
  outputs['num_detections'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: num_detections:0
  outputs['raw_detection_boxes'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300, 4)
      name: raw_detection_boxes:0
  outputs['raw_detection_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 300, 2)
      name: raw_detection_scores:0
Method name is: tensorflow/serving/predict
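
A signature dump like the one above can be produced with TensorFlow's saved_model_cli tool (assuming the export directory used earlier; export_inference_graph.py writes the servable under a saved_model subfolder):

```shell
# Inspect the exported model's serving signature (inputs, outputs, dtypes, shapes)
saved_model_cli show \
    --dir exported_graphs/saved_model \
    --tag_set serve \
    --signature_def serving_default
```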

After this, I deployed the model on ML Engine with the following configuration:

Python version: 2.7
Framework: TensorFlow
Framework version: 1.12.3
Runtime version: 1.12
Machine type: Single core CPU
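
For reference, a deployment with that configuration would look roughly like the following sketch (the model name my_model, version name v1, and the GCS path are placeholders, not values from the question):

```shell
# Create a model version on ML Engine from the exported SavedModel
gcloud ml-engine versions create v1 \
    --model my_model \
    --origin gs://$1/exported_graphs/saved_model \
    --runtime-version 1.12 \
    --framework tensorflow \
    --python-version 2.7
```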

I receive the following error:

Create Version failed. Bad model detected with error: "Failed to load model: Loading servable: {name: default version: 1} failed: Not found: Op type not registered 'FusedBatchNormV3' in binary running on localhost. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.\n\n (Error code: 0)"

2 Answers

It's most likely a TensorFlow version incompatibility somewhere, e.g. between the model and the runtime: the FusedBatchNormV3 op in the error message is not registered in the 1.12 runtime, which suggests the graph was exported with a newer TensorFlow. Did you create your model with the version of TensorFlow you are actually running it on?

Many threads seem to confirm this answer:

Not found: Op type not registered 'CountExtremelyRandomStats'

Bad model deploying to GCP Cloudml
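
To make the failure mode concrete: the 1.12 runtime does not know the FusedBatchNormV3 op, so a graph exported with a newer TensorFlow cannot be loaded by it. A minimal sketch of the compatibility rule (graph_loadable is a hypothetical helper, not part of any TensorFlow API; in practice a runtime only guarantees the ops of its own major.minor version):

```python
# Hypothetical helper: a graph exported with TensorFlow version X is only
# safe to serve on a runtime whose major.minor version is >= X's major.minor.
def graph_loadable(export_version, runtime_version):
    """Return True when the runtime is at least as new as the exporter."""
    parse = lambda v: tuple(int(p) for p in v.split(".")[:2])
    return parse(runtime_version) >= parse(export_version)

# Exporting with 1.12.x and serving on runtime 1.12 is fine...
print(graph_loadable("1.12.3", "1.12"))   # True
# ...but a 1.14 export uses ops (e.g. FusedBatchNormV3) unknown to 1.12.
print(graph_loadable("1.14.0", "1.12"))   # False
```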


I was able to figure this out: while exporting the model, I was using a different TensorFlow version. To keep things coherent and avoid such errors, make sure the TensorFlow versions used during training, exporting, and deployment are all the same.
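
One way to verify this before exporting (a sketch; 1.12.3 here matches the framework version from the question, adjust as needed):

```shell
# Print the TensorFlow version in the environment used for exporting;
# it should match the --runtime-version used for training and deployment.
python -c "import tensorflow as tf; print(tf.__version__)"

# If it does not match, pin it before re-running export_inference_graph.py:
pip install "tensorflow==1.12.3"
```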