Unable to run faster R-CNN with elastic inference and tensorflow serving - how to debug?

Question

I found a Faster R-CNN saved_model in TensorFlow's Model Zoo, and I am able to run it locally using the following code:

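import os

import numpy as np
import tensorflow as tf
from PIL import Image

# Load the image as an array and add a leading batch dimension.
image = np.array(Image.open('my_input.jpg'))
input_tensor = tf.convert_to_tensor(image)
input_tensor = input_tensor[tf.newaxis, ...]

# Load the SavedModel and select its default serving signature.
model = tf.saved_model.load(os.path.join('<PATH_TO_SAVED_MODEL>'))
model = model.signatures['serving_default']

output_dict = model(input_tensor)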

I wanted to try running this using Elastic Inference, and started out with this guide. To serve my Faster R-CNN model instead of the guide's model, I changed only the path to the saved_model when starting up TensorFlow Serving:

EI_VISIBLE_DEVICES=0 amazonei_tensorflow_model_server --model_name=f_r_cnn --model_base_path=/tmp/f_r_cnn --port=9000

Now I'm trying to run a client to talk to tensorflow serving, using the template provided:

from __future__ import print_function

import grpc
import tensorflow as tf
from PIL import Image
import numpy as np
import time
import os
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

tf.app.flags.DEFINE_string('server', 'localhost:9000',
                           'PredictionService host:port')
tf.app.flags.DEFINE_string('image', '', 'path to image in JPEG format')
FLAGS = tf.app.flags.FLAGS

coco_classes_txt = "https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-paper.txt"
local_coco_classes_txt = "/tmp/coco-labels-paper.txt"
# Download the COCO label list to a local file.
os.system("curl -o %s %s" % (local_coco_classes_txt, coco_classes_txt))
NUM_PREDICTIONS = 5
with open(local_coco_classes_txt) as f:
  classes = ["No Class"] + [line.strip() for line in f.readlines()]


def main(_):
  channel = grpc.insecure_channel(FLAGS.server)
  stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

  # Send request
  with Image.open(FLAGS.image) as f:
    f.load()
    # See prediction_service.proto for gRPC request/response details.
    data = np.asarray(f)
    data = np.expand_dims(data, axis=0)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'f_r_cnn'
    request.inputs['inputs'].CopyFrom(
        tf.contrib.util.make_tensor_proto(data, shape=data.shape))
    result = stub.Predict(request, 60.0)  # 60 secs timeout
    outputs = result.outputs
    detection_classes = outputs["detection_classes"]
    detection_classes = tf.make_ndarray(detection_classes)
    num_detections = int(tf.make_ndarray(outputs["num_detections"])[0])
    print("%d detection[s]" % (num_detections))
    class_label = [classes[int(x)]
                   for x in detection_classes[0][:num_detections]]
    print("SSD Prediction is ", class_label)


if __name__ == '__main__':
  tf.app.run()

While this client ran just fine with the model from the tutorial (no surprise there), it is failing when I try to get it to talk to my Faster R-CNN model with the following error:

debug_error_string = "{"created":"@1579654607.391705065","description":"Error received from peer ipv6:[::1]:9000","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Unexpected error in RPC handling","grpc_status":2}"

I googled this error and was unable to find anything useful. What is grpc_status 2? How might I find useful information to help point me in the right direction?

donnadionne · Accepted Answer · 2020-01-22T23:12:32

"grpc_message":"Unexpected error in RPC handling","grpc_status":2}"

indicates that the server-side method handler for the request threw an exception while it was being called. (We catch all exceptions and return this generic error.) grpc_status 2 is the UNKNOWN status code, so the client-side message carries no further detail. To debug, look at the server-side method handler for the RPC request.
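To make the status number easier to read, here is a minimal sketch that decodes a debug_error_string like the one in the question. It assumes the string is valid JSON, as it is in the question (other gRPC versions may format it differently). The helper name summarize_debug_error_string is hypothetical and not part of gRPC. The table lists the standard gRPC status codes.

```python
import json

# Standard gRPC status codes (numeric value -> name).
GRPC_STATUS_NAMES = {
    0: "OK", 1: "CANCELLED", 2: "UNKNOWN", 3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 6: "ALREADY_EXISTS",
    7: "PERMISSION_DENIED", 8: "RESOURCE_EXHAUSTED",
    9: "FAILED_PRECONDITION", 10: "ABORTED", 11: "OUT_OF_RANGE",
    12: "UNIMPLEMENTED", 13: "INTERNAL", 14: "UNAVAILABLE",
    15: "DATA_LOSS", 16: "UNAUTHENTICATED",
}


def summarize_debug_error_string(raw):
    """Hypothetical helper: pull the status and message out of the JSON text.

    Assumes `raw` is the JSON object shown after `debug_error_string = `,
    without the surrounding quotes.
    """
    info = json.loads(raw)
    status = info.get("grpc_status")
    return {
        "status": status,
        "status_name": GRPC_STATUS_NAMES.get(status, "UNRECOGNIZED"),
        "message": info.get("grpc_message"),
        "description": info.get("description"),
    }


# The error text from the question, split across lines for readability.
raw = (
    '{"created":"@1579654607.391705065",'
    '"description":"Error received from peer ipv6:[::1]:9000",'
    '"file":"src/core/lib/surface/call.cc","file_line":1052,'
    '"grpc_message":"Unexpected error in RPC handling","grpc_status":2}'
)

summary = summarize_debug_error_string(raw)
print(summary["status_name"], "-", summary["message"])
# prints: UNKNOWN - Unexpected error in RPC handling
```

In the client itself, wrapping stub.Predict in a try/except for grpc.RpcError and printing e.code() and e.details() should surface the same two fields without any parsing. Since UNKNOWN only says that the handler failed, the actual exception is more likely to show up in the output of the amazonei_tensorflow_model_server process than in anything the client receives.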
