6
votes

Right now we are able to serve models with TensorFlow Serving. We used the following method to export the model and host it with TensorFlow Serving.

     For exporting
     -------------
     from tensorflow.contrib.session_bundle import exporter

     K.set_learning_phase(0)
     export_path = ... # where to save the exported graph
     export_version = ... # version number (integer)

     saver = tf.train.Saver(sharded=True)
     model_exporter = exporter.Exporter(saver)
     signature = exporter.classification_signature(input_tensor=model.input,
                                          scores_tensor=model.output)
     model_exporter.init(sess.graph.as_graph_def(),
                default_graph_signature=signature)
     model_exporter.export(export_path, tf.constant(export_version), sess)

     For hosting
     -----------

      bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=default --model_base_path=/serving/models

However, our issue is that we want Keras to be integrated with TensorFlow Serving: we would like to serve Keras models through TensorFlow Serving. The reason is that our architecture trains models in a couple of different ways, such as deeplearning4j + Keras and TensorFlow + Keras, but for serving we would like to use a single engine, TensorFlow Serving. We don't see any straightforward way to achieve that. Any comments?

Thank you.

3
I know you're asking for TF Serving, so I won't post this as an answer, but if you want actual tooling attached to your model instead of a black box you could also look at: github.com/deeplearning4j/deeplearning4j/blob/master/… for Keras (I only mention this because you use dl4j as part of your pipeline) - Adam Gibson

3 Answers

18
votes

TensorFlow changed the way it exports models very recently, so the majority of the tutorials available on the web are outdated. I honestly don't know how deeplearning4j works, but I use Keras quite often. I managed to create a simple example that I already posted on this issue in the TensorFlow Serving GitHub repository.

I'm not sure whether this will help you, but I'd like to share how I did it; maybe it will give you some insights. Before creating my custom model, my first attempt was to use a pretrained model available in Keras, such as VGG19. I did this as follows.

Model creation

import keras.backend as K
from keras.applications import VGG19
from keras.models import Model

# very important to do this as a first thing
K.set_learning_phase(0)

model = VGG19(include_top=True, weights='imagenet')

# The creation of a new model might be optional depending on the goal
config = model.get_config()
weights = model.get_weights()
new_model = Model.from_config(config)
new_model.set_weights(weights)

Exporting the model

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

export_path = 'folder_to_export'
builder = saved_model_builder.SavedModelBuilder(export_path)

signature = predict_signature_def(inputs={'images': new_model.input},
                                  outputs={'scores': new_model.output})

with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={'predict': signature})
    builder.save()
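
One thing to watch with export_path: TensorFlow Serving looks for numeric version subdirectories under a model's base path (for example /tmp/model_1/1), so the folder passed to SavedModelBuilder should normally end in a version number. The helper below is only a sketch of one way to pick that folder; next_version_dir and the my_model directory are hypothetical names, not part of the export code above.

```python
import os
import tempfile


def next_version_dir(base_path):
    """Return base_path/<N>, where N is one more than the highest numeric subdirectory.

    The returned directory is NOT created, because SavedModelBuilder expects
    an export directory that does not exist yet.
    """
    os.makedirs(base_path, exist_ok=True)
    versions = [
        int(name)
        for name in os.listdir(base_path)
        if name.isdecimal() and os.path.isdir(os.path.join(base_path, name))
    ]
    return os.path.join(base_path, str(max(versions, default=0) + 1))


# Demo in a throwaway directory (hypothetical layout).
base = os.path.join(tempfile.mkdtemp(), 'my_model')
first = next_version_dir(base)    # ends in .../my_model/1
os.makedirs(first)                # pretend a model was exported there
second = next_version_dir(base)   # ends in .../my_model/2
```

The result would then be used as export_path, with the parent folder as the base path given to the server.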

Some side notes

  • The details can vary depending on the Keras, TensorFlow, and TensorFlow Serving versions. I used the latest ones.
  • Beware of the names of the signatures, since the client must use the same names.
  • When creating the client, all preprocessing steps that the model needs (preprocess_input() for example) must be executed. I didn't try to add such a step to the graph itself, as the Inception client example does.
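
To illustrate the point about names, here is a rough sketch of how a client request has to repeat the signature name ('predict') and the input key ('images') used at export time. It assumes the server was also started with the REST API enabled (for example via --rest_api_port), which is not part of the setup above; a gRPC client would carry the same two names. build_predict_request is a hypothetical helper, and the batch is dummy data standing in for an already preprocessed image.

```python
import json


def build_predict_request(batch, input_key='images', signature_name='predict'):
    """Serialize an already-preprocessed batch as a REST predict request body.

    input_key and signature_name must match the names used when the
    model was exported, otherwise the server rejects the request.
    """
    body = {
        'signature_name': signature_name,
        'inputs': {input_key: batch},
    }
    return json.dumps(body)


# A dummy 1x2x2x3 "image" batch as nested lists (not real image data).
dummy_batch = [[[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],
                [[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]]]
payload = build_predict_request(dummy_batch)
```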

With respect to serving different models within the same server, I think that something similar to the creation of a model_config_file might help you. To do so, you can create a config file similar to this:

model_config_list: {
  config: {
    name: "my_model_1",
    base_path: "/tmp/model_1",
    model_platform: "tensorflow"
  },
  config: {
     name: "my_model_2",
     base_path: "/tmp/model_2",
     model_platform: "tensorflow"
  }
}

Finally, you can run the server like this:

bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_config_file=model_config.conf
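
If the list of models changes often, the config file can be generated instead of edited by hand. This is only a sketch: render_model_config is a hypothetical helper, and it emits exactly the fields shown in the config above, nothing more.

```python
def render_model_config(models):
    """Render a model_config_list block from a {name: base_path} mapping."""
    entries = []
    for name, base_path in models.items():
        entries.append(
            '  config: {\n'
            f'    name: "{name}",\n'
            f'    base_path: "{base_path}",\n'
            '    model_platform: "tensorflow"\n'
            '  }'
        )
    return 'model_config_list: {\n' + ',\n'.join(entries) + '\n}\n'


config_text = render_model_config({
    'my_model_1': '/tmp/model_1',
    'my_model_2': '/tmp/model_2',
})
print(config_text)
```

Writing config_text to model_config.conf gives one entry per exported model, regardless of which framework trained it.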
0
votes

Try this script I wrote. It converts Keras models into TensorFlow frozen graphs (I saw that some models behave strangely when you export them without freezing the variables).

import sys
from keras.models import load_model
import tensorflow as tf
from keras import backend as K
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants


K.set_learning_phase(0)
K.set_image_data_format('channels_last')

INPUT_MODEL = sys.argv[1]
NUMBER_OF_OUTPUTS = 1
OUTPUT_NODE_PREFIX = 'output_node'
OUTPUT_FOLDER = 'frozen'
OUTPUT_GRAPH = 'frozen_model.pb'
OUTPUT_SERVABLE_FOLDER = sys.argv[2]
INPUT_TENSOR = sys.argv[3]


try:
    model = load_model(INPUT_MODEL)
except ValueError as err:
    print('Please check the input saved model file')
    raise err

output = [None]*NUMBER_OF_OUTPUTS
output_node_names = [None]*NUMBER_OF_OUTPUTS
for i in range(NUMBER_OF_OUTPUTS):
    output_node_names[i] = OUTPUT_NODE_PREFIX+str(i)
    output[i] = tf.identity(model.outputs[i], name=output_node_names[i])
print('Output Tensor names: ', output_node_names)


sess = K.get_session()
try:
    frozen_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), output_node_names)
    graph_io.write_graph(frozen_graph, OUTPUT_FOLDER, OUTPUT_GRAPH, as_text=False)
    print(f'Frozen graph ready for inference/serving at {OUTPUT_FOLDER}/{OUTPUT_GRAPH}')
except Exception as err:
    # Re-raise: the steps below need the frozen graph file to exist
    print('Error occurred while freezing the graph')
    raise err



builder = tf.saved_model.builder.SavedModelBuilder(OUTPUT_SERVABLE_FOLDER)

with tf.gfile.GFile(f'{OUTPUT_FOLDER}/{OUTPUT_GRAPH}', "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

sigs = {}
OUTPUT_TENSOR = output_node_names
with tf.Session(graph=tf.Graph()) as sess:
    tf.import_graph_def(graph_def, name="")
    g = tf.get_default_graph()
    inp = g.get_tensor_by_name(INPUT_TENSOR)
    out = g.get_tensor_by_name(OUTPUT_TENSOR[0] + ':0')

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            {"input": inp}, {"output": out})

    builder.add_meta_graph_and_variables(sess,
                                         [tag_constants.SERVING],
                                         signature_def_map=sigs)
    try:
        builder.save()
        print(f'Model ready for deployment at {OUTPUT_SERVABLE_FOLDER}/saved_model.pb')
        print('Prediction signature : ')
        print(sigs['serving_default'])
    except Exception as err:
        print('Error occurred, please check the frozen graph')
        raise err
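
A note on the third argument: get_tensor_by_name expects a tensor name of the form op_name:output_index, so passing a bare op name fails. A small guard like the one below could be applied to INPUT_TENSOR before the lookup. This is a sketch only; normalize_tensor_name is a hypothetical helper, and 'input_1' is just an example name that depends on your model.

```python
def normalize_tensor_name(name):
    """Return name in 'op_name:output_index' form, defaulting to output 0."""
    _, sep, index = name.rpartition(':')
    if sep and index.isdecimal():
        # Already has an output index, e.g. 'input_1:0'
        return name
    return name + ':0'


bare = normalize_tensor_name('input_1')
full = normalize_tensor_name('input_1:0')
scoped = normalize_tensor_name('block1/conv:1')
```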
0
votes

I recently wrote this blog post that explains how to save a Keras model and serve it with TensorFlow Serving.

TL;DR: Saving a pretrained InceptionV3 model:

import os

import keras
import tensorflow as tf

# Load a pretrained inception_v3
inception_model = keras.applications.inception_v3.InceptionV3(weights='imagenet')

# Define a destination path for the model
MODEL_EXPORT_DIR = '/tmp/inception_v3'
MODEL_VERSION = 1
MODEL_EXPORT_PATH = os.path.join(MODEL_EXPORT_DIR, str(MODEL_VERSION))

# We'll need to create an input mapping, and name each of the input tensors.
# In the inception_v3 Keras model, there is only a single input and we'll name it 'image'
input_names = ['image']
name_to_input = {name: t_input for name, t_input in zip(input_names, inception_model.inputs)}

# Save the model to the MODEL_EXPORT_PATH
# Note using 'name_to_input' mapping, the names defined here will also be used for querying the service later
tf.saved_model.simple_save(
    keras.backend.get_session(),
    MODEL_EXPORT_PATH,
    inputs=name_to_input,
    outputs={t.name: t for t in inception_model.outputs})

And then starting a TF Serving Docker container:

  1. Copy the saved model to the specified directory on the host (source=/tmp/inception_v3 in this example).

  2. Run the Docker container:

docker run -d -p 8501:8501 --name keras_inception_v3 --mount type=bind,source=/tmp/inception_v3,target=/models/inception_v3 -e MODEL_NAME=inception_v3 -t tensorflow/serving

  3. Verify that there's network access to the TensorFlow service. To get the local Docker IP (172.*.*.*) for testing, run:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' keras_inception_v3
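
Once you have the IP, one way to check access is the model status endpoint of the TensorFlow Serving REST API, which is served on the same port (8501) as predictions. The sketch below only uses the standard library; the helper names and the address 172.17.0.2 are hypothetical, so substitute the IP printed by docker inspect.

```python
import json
import urllib.request


def model_status_url(host, model_name, port=8501):
    """Build the REST URL that reports the status of a served model."""
    return f'http://{host}:{port}/v1/models/{model_name}'


def fetch_model_status(host, model_name, port=8501, timeout=5):
    """Return the status JSON; raises urllib.error.URLError if unreachable."""
    url = model_status_url(host, model_name, port)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode('utf-8'))


# Hypothetical container IP; no request is sent until fetch_model_status is called.
url = model_status_url('172.17.0.2', 'inception_v3')
```

Calling fetch_model_status('172.17.0.2', 'inception_v3') against a running container should return the version states of the model if the service is reachable.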