I am new to TensorFlow. I have followed the TensorFlow Serving instructions to serve models in a Docker container. I am able to serve the MNIST and Inception models by following the instructions from https://www.tensorflow.org/serving/.

The served models are saved in the following structure:

.
|-- inception-export
|   `-- 1
|       |-- saved_model.pb
|       `-- variables
|           |-- variables.data-00000-of-00001
|           `-- variables.index
`-- mnist_model
    `-- 1
        |-- saved_model.pb
        `-- variables
            |-- variables.data-00000-of-00001
            `-- variables.index

Questions:

  1. How to serve retrained models?

I am following the instructions from https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0 to retrain models.

python retrain.py \
  --bottleneck_dir=bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=inception \
  --summaries_dir=training_summaries/basic \
  --output_graph=retrained_graph.pb \
  --output_labels=retrained_labels.txt \
  --image_dir=flower_photos

The above command creates retrained_graph.pb along with retrained_labels.txt and the bottlenecks directory.

How do I convert this output into a format so that the retrained model can be served through the TensorFlow Serving server?

  2. How to serve pretrained models?

    I have looked at the Object Detection demo at https://github.com/tensorflow/models/blob/master/object_detection/object_detection_tutorial.ipynb, which explains how to use an "SSD with Mobilenet" model (https://github.com/tensorflow/models/blob/master/object_detection/g3doc/detection_model_zoo.md) for object detection.

    The ssd_mobilenet_v1_coco_11_06_2017.tar.gz archive contains:

    - a graph proto (graph.pbtxt)
    - a checkpoint (model.ckpt.data-00000-of-00001, model.ckpt.index, model.ckpt.meta)
    - a frozen graph proto with weights baked into the graph as constants (frozen_inference_graph.pb) 
    

    How do I convert the above files into a format so that the pretrained model can be served through the TensorFlow Serving server?

  3. How to create a client for a custom model served through the TensorFlow Serving server?

I have followed the instructions at http://fdahms.com/2017/03/05/tensorflow-serving-jvm-client/ to create a custom model. The blog explains how to create a custom model, serve it through the TensorFlow Serving server, and write a client to access the model. The process of creating the client is not very clear to me. I want to create clients in Python and Java.

Is there a better example or guide to help me understand the process of creating client code for custom models served through the TensorFlow Serving server?

1 Answer

TensorFlow Serving now supports the SavedModel format. If you have a retrained model, you actually don't need to use object detection. What you can do is use a saver to restore a session from the retrained model in its previous format and then export it again with SavedModelBuilder, which generates a SavedModel that TensorFlow Serving can serve. Here is my other answer to a similar question.
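
For example, here is a minimal sketch of such an export, assuming the frozen retrained_graph.pb produced by the codelab's retrain.py and its default tensor names ('DecodeJpeg/contents:0' for the input, 'final_result:0' for the output); the export path and the 'image'/'scores' keys are just placeholders. Because the weights are already baked into that frozen graph, no Saver restore is needed here; if you start from checkpoints instead, restore them with tf.train.Saver before exporting:

import tensorflow as tf

# Paths and tensor names below are assumptions based on the codelab's
# retrain.py defaults; adjust them to match your own graph.
export_dir = './retrained_serving/1'   # versioned directory, as in the tree above
graph_pb = 'retrained_graph.pb'

with tf.Session() as sess:
  # Load the frozen GraphDef produced by retrain.py. The weights are baked
  # in as constants, so there is no checkpoint to restore for this file.
  with tf.gfile.GFile(graph_pb, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
  tf.import_graph_def(graph_def, name='')

  # Default input/output tensors of the codelab graph; verify against your
  # own graph (for example by listing sess.graph.get_operations()).
  input_tensor = sess.graph.get_tensor_by_name('DecodeJpeg/contents:0')
  output_tensor = sess.graph.get_tensor_by_name('final_result:0')

  # Re-export as a SavedModel that tensorflow_model_server can load.
  builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
  signature = tf.saved_model.signature_def_utils.predict_signature_def(
      inputs={'image': input_tensor},
      outputs={'scores': output_tensor})
  builder.add_meta_graph_and_variables(
      sess,
      [tf.saved_model.tag_constants.SERVING],
      signature_def_map={
          tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
              signature})
  builder.save()

Pointing tensorflow_model_server at ./retrained_serving then picks up version 1, the same layout as the inception-export and mnist_model directories in the question.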

As for the client, you can reference the code below, which is also the example in tensorflow_serving/example:

from grpc.beta import implementations
import tensorflow as tf

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

tf.app.flags.DEFINE_string('server', 'localhost:9000',
                           'PredictionService host:port')
FLAGS = tf.app.flags.FLAGS


def main(_):
  # Open an insecure gRPC channel to the model server.
  host, port = FLAGS.server.split(':')
  channel = implementations.insecure_channel(host, int(port))
  stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

  # Build the request. 'model_name', 'signature_name', 'input_key',
  # 'output_key', your_data and data_size are placeholders; replace them
  # with the values your exported SavedModel actually uses.
  request = predict_pb2.PredictRequest()
  request.model_spec.name = 'model_name'
  request.model_spec.signature_name = 'signature_name'
  request.inputs['input_key'].CopyFrom(
      tf.contrib.util.make_tensor_proto(your_data, shape=data_size))

  result = stub.Predict(request, 10.0)  # 10 secs timeout
  # PredictResponse.outputs is a map keyed by the signature's output names.
  print(result.outputs['output_key'])


if __name__ == '__main__':
  tf.app.run()
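
Assuming you save the script as, say, predict_client.py (the name is arbitrary), you can run it with python predict_client.py --server=localhost:9000 once the placeholders are filled in. A Java client follows the same pattern: generate gRPC stubs from the tensorflow_serving proto files and send the same PredictRequest, which is essentially what the JVM client blog post you linked walks through.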