
If I have a TensorFlow model using a custom Estimator, how would I save the model so that I can deploy it to production?

https://colab.research.google.com/github/google-research/bert/blob/master/predicting_movie_reviews_with_bert_on_tf_hub.ipynb#scrollTo=JIhejfpyJ8Bx

The model I'm using is similar to this one, and I was wondering how to save the model once it's been trained. I have tried using SavedModel and restoring from checkpoints, and have been unsuccessful with both (I was unable to adapt them for this example).


1 Answer


One way to do this is via gRPC. TF's documentation on this is not very straightforward: https://www.tensorflow.org/tfx/serving/serving_basic The hardest part is actually saving your model; after that, hosting it via Docker is well documented. Finally, you can run inference against it using a gRPC client, e.g. https://github.com/epigramai/tfserving-python-predict-client

To do this, you need to save your model first. Something like this, which you will need to tweak a bit for your example:

    import tensorflow as tf

    def save_serving_model(self, estimator):
        feature_placeholder = {'sentence': tf.placeholder('string', [1], name='sentence_placeholder')}

        # build_raw_serving_input_receiver_fn doesn't serialize its inputs, which
        # avoids confusion between bytes and strings: you can simply pass a raw string.
        serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(feature_placeholder)

        # Export the SavedModel to ./TEST_Dir
        estimator.export_savedmodel("./TEST_Dir", serving_input_fn)

This will save the model under ./TEST_Dir (in a timestamped subdirectory). As a quick test you can run:

saved_model_cli run --dir /path/to/model/ --tag_set serve --signature_def predict --input_exprs="sentence=['This API is a little tricky']"

The next step is hosting this model, or "serving" it. The way I do this is via Docker, e.g. with a command like the following (adjust the mount paths and MODEL_NAME for your own model):

docker run -p 8500:8500 \
--mount type=bind,source=/tmp/mnist,target=/models/mnist \
-e MODEL_NAME=mnist -t tensorflow/serving &

Finally, you can use a predict client (via gRPC) to send a sentence to the server and get the result back. The GitHub link above has two blog posts on that.