
I need to run remote, online predictions using the TensorFlow Object Detection API on Google AI Platform. When I request online predictions from Object Detection models deployed on AI Platform, I get an error similar to:

HttpError 400 Tensor name: num_proposals has inconsistent batch size: 1 expecting: 49152

When I execute predictions locally (e.g. result = model(image)), I get the desired results.

This error occurs for a variety of Object Detection models (Mask R-CNN, MobileNet). It occurs both for models I have trained myself and for models loaded directly from the Object Detection Model Zoo (v2). The same code succeeds against a model deployed on AI Platform that is not an Object Detection model.

Signature Information

The model input signature-def seems to be correct:

!saved_model_cli show --dir {MODEL_DIR_GS}
!saved_model_cli show --dir {MODEL_DIR_GS} --tag_set serve 
!saved_model_cli show --dir {MODEL_DIR_GS} --tag_set serve --signature_def serving_default

gives:

The given SavedModel contains the following tag-sets:
serve
The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "__saved_model_init_op"
SignatureDef key: "serving_default"
The given SavedModel SignatureDef contains the following input(s):
  inputs['input_tensor'] tensor_info:
      dtype: DT_UINT8
      shape: (1, -1, -1, 3)
      name: serving_default_input_tensor:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['anchors'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 4)
      name: StatefulPartitionedCall:0
  outputs['box_classifier_features'] tensor_info:
      dtype: DT_FLOAT
      shape: (300, 9, 9, 1536)
      name: StatefulPartitionedCall:1
  outputs['class_predictions_with_background'] tensor_info:
      dtype: DT_FLOAT
      shape: (300, 2)
      name: StatefulPartitionedCall:2
  outputs['detection_anchor_indices'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100)
      name: StatefulPartitionedCall:3
  outputs['detection_boxes'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100, 4)
      name: StatefulPartitionedCall:4
  outputs['detection_classes'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100)
      name: StatefulPartitionedCall:5
  outputs['detection_masks'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100, 33, 33)
      name: StatefulPartitionedCall:6
  outputs['detection_multiclass_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100, 2)
      name: StatefulPartitionedCall:7
  outputs['detection_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 100)
      name: StatefulPartitionedCall:8
  outputs['final_anchors'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 300, 4)
      name: StatefulPartitionedCall:9
  outputs['image_shape'] tensor_info:
      dtype: DT_FLOAT
      shape: (4)
      name: StatefulPartitionedCall:10
  outputs['mask_predictions'] tensor_info:
      dtype: DT_FLOAT
      shape: (100, 1, 33, 33)
      name: StatefulPartitionedCall:11
  outputs['num_detections'] tensor_info:
      dtype: DT_FLOAT
      shape: (1)
      name: StatefulPartitionedCall:12
  outputs['num_proposals'] tensor_info:
      dtype: DT_FLOAT
      shape: (1)
      name: StatefulPartitionedCall:13
  outputs['proposal_boxes'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 300, 4)
      name: StatefulPartitionedCall:14
  outputs['proposal_boxes_normalized'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 300, 4)
      name: StatefulPartitionedCall:15
  outputs['raw_detection_boxes'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 300, 4)
      name: StatefulPartitionedCall:16
  outputs['raw_detection_scores'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 300, 2)
      name: StatefulPartitionedCall:17
  outputs['refined_box_encodings'] tensor_info:
      dtype: DT_FLOAT
      shape: (300, 1, 4)
      name: StatefulPartitionedCall:18
  outputs['rpn_box_encodings'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 12288, 4)
      name: StatefulPartitionedCall:19
  outputs['rpn_objectness_predictions_with_background'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 12288, 2)
      name: StatefulPartitionedCall:20
Method name is: tensorflow/serving/predict
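
Note that several outputs in this signature do not have a leading dimension equal to the batch size (1), which turns out to matter for the error below. A quick way to see which ones, with the shapes transcribed from the saved_model_cli dump above (a subset, for illustration):

```python
# Output shapes transcribed from the saved_model_cli dump above (subset).
output_shapes = {
    "anchors": (-1, 4),
    "box_classifier_features": (300, 9, 9, 1536),
    "class_predictions_with_background": (300, 2),
    "detection_boxes": (1, 100, 4),
    "mask_predictions": (100, 1, 33, 33),
    "num_proposals": (1,),
    "refined_box_encodings": (300, 1, 4),
    "rpn_box_encodings": (1, 12288, 4),
}

# Outputs whose first dimension is not the batch size (1). These are the
# tensors that get named in the "inconsistent batch size" errors.
mismatched = sorted(
    name for name, shape in output_shapes.items() if shape[0] != 1
)
print(mismatched)
```

Every tensor name that appears in the errors below (num_proposals aside, where the reported sizes are swapped) is one of these unbatched outputs.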

Steps to Reproduce

  1. Download a model from the TensorFlow Model Zoo.

  2. Deploy it to AI Platform:

!gcloud config set project $PROJECT
!gcloud beta ai-platform models create $MODEL --regions=us-central1 
%%bash -s $PROJECT $MODEL $VERSION $MODEL_DIR_GS
gcloud ai-platform versions create $3 \
  --project $1 \
  --model $2 \
  --origin $4 \
  --runtime-version=2.1 \
  --framework=tensorflow \
  --python-version=3.7 \
  --machine-type=n1-standard-2 \
  --accelerator=type=nvidia-tesla-t4,count=1

  3. Evaluate remotely:

import googleapiclient.discovery
import numpy as np
import socket

img_np = np.zeros((100, 100, 3), dtype=np.uint8)
img_list = img_np.tolist()  # np.ndarray has tolist(), not to_list()
instances = [img_list]

socket.setdefaulttimeout(600)  # set timeout to 10 minutes
service = googleapiclient.discovery.build('ml', 'v1', cache_discovery=False, )
model_version_string = 'projects/{}/models/{}/versions/{}'.format(PROJECT, MODEL, VERSION)
print(model_version_string)

response = service.projects().predict(
    name=model_version_string,
    body={'instances': instances}
).execute()

if 'error' in response:
    raise RuntimeError(response['error'])
else:
    print(f'Success. keys={list(response.keys())}')

I get an error similar to:

HttpError: <HttpError 400 when requesting 
https://ml.googleapis.com/v1/projects/gcp_project/models/error_demo/versions/mobilenet:predict?alt=json
returned "{ "error": "Tensor name: refined_box_encodings has inconsistent batch size: 300 
expecting: 1"}}>

Additional Information

  • The code also fails if I change the instances variable in the request body from instances = [img_list] to instances = [{'input_tensor': img_list}].

  • If I intentionally use an incorrect input shape (e.g. (1, 100, 100, 2) or (100, 100, 2)), I get a response correctly stating that the input shape is wrong.

  • The Google Cloud Storage JSON Error Code documentation states:

invalidArgument -- The value for one of fields in the request body was invalid.
  • If I repeat the prediction, I get the same error message, but with a different tensor name each time.

  • If I run the prediction using gcloud:

import json

x = {"instances": [
    [
        [[0, 0, 0], [0, 0, 0]],
        [[0, 0, 0], [0, 0, 0]]
    ]
]}
with open('test.json', 'w') as f:
  json.dump(x, f)

!gcloud ai-platform predict --model $MODEL --json-request=./test.json 

I get an INVALID_ARGUMENT error.

ERROR: (gcloud.ai-platform.predict) HTTP request failed. Response: {
  "error": {
    "code": 400,
    "message": "{ \"error\": \"Tensor name: anchors has inconsistent batch size: 49152 expecting: 1\" }",
    "status": "INVALID_ARGUMENT"
  }
}
  • I get the same error if I submit the same JSON data via the Google Cloud Console (the Test & Use tab of the AI Platform Version Details screen) or via the REST API described in the AI Platform Prediction documentation (Method: projects.predict).

I enabled logging (both regular and console), but it gives no additional information.

I've placed the details required to reproduce in a Colab.

Thanks in advance. I've spent over a day working on this and am really stuck!


1 Answer


Per https://github.com/tensorflow/serving/issues/1047, when a request uses the instances key (row format), TensorFlow Serving requires every output tensor to share the request's batch size so that the response can be split into per-instance rows. Several Object Detection outputs (e.g. anchors, refined_box_encodings) have a leading dimension that is not the batch size, so the request is rejected. The workaround is to use the inputs key (column format) instead.
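
The mechanism can be illustrated with a toy version of the check (a sketch only; the real check lives inside TensorFlow Serving's row-format response builder, and check_row_format is a made-up name):

```python
# Toy model of TF Serving's row ("instances") format check: every output's
# leading dimension must equal the request's batch size, or the whole
# request is rejected with a 400-style error.
def check_row_format(outputs: dict, batch_size: int) -> None:
    for name, shape in outputs.items():
        if shape[0] != batch_size:
            raise ValueError(
                f"Tensor name: {name} has inconsistent batch size: "
                f"{shape[0]} expecting: {batch_size}"
            )

outputs = {
    "detection_boxes": (1, 100, 4),       # batched: passes the check
    "refined_box_encodings": (300, 1, 4)  # unbatched: rejected
}

try:
    check_row_format(outputs, batch_size=1)
except ValueError as e:
    print(e)  # mirrors the HttpError 400 message text

# The column ("inputs") format skips this per-row fan-out, so the same
# model outputs are returned as-is.
```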

E.g.

inputs = [img_list]
...
response = service.projects().predict(
    name=model_version_string,
    body={'inputs': inputs}
).execute()