One way to do this is via gRPC.
TensorFlow's documentation on this is not the most straightforward: https://www.tensorflow.org/tfx/serving/serving_basic
The hardest part is actually saving your model; once that is done, hosting it via Docker is well documented.
Finally, you can run inference against it using a gRPC client, e.g. https://github.com/epigramai/tfserving-python-predict-client
To do this, you first need to save your model. Something like the following, which you will need to tweak a bit for your example:

```python
import tensorflow as tf

def save_serving_model(self, estimator):
    # A raw placeholder for the single string input the model expects.
    feature_placeholder = {
        'sentence': tf.placeholder(tf.string, [1], name='sentence_placeholder')
    }
    # build_raw_serving_input_receiver_fn doesn't serialize inputs, so it avoids
    # the confusion between bytes and strings. You can simply pass a string.
    serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
        feature_placeholder)
    # Save the model; export_savedmodel writes a timestamped subdirectory
    # under the given path and returns its path.
    estimator.export_savedmodel("./TEST_Dir", serving_input_fn)
```
This will save the model under a timestamped subdirectory of TEST_Dir.
As a quick test you can run:

```shell
saved_model_cli run --dir /path/to/model/ --tag_set serve --signature_def predict \
    --input_exprs="sentence=['This API is a little tricky']"
```
The next step is hosting this model, i.e. "serving" it. The way I do this is via Docker, with a command like:

```shell
docker run -p 8500:8500 \
    --mount type=bind,source=/tmp/mnist,target=/models/mnist \
    -e MODEL_NAME=mnist -t tensorflow/serving &
```

(This is the MNIST example from the TensorFlow Serving docs; swap in your own export directory for `source=` and your own model name for `MODEL_NAME`.)
Finally, you can use a predict client (via gRPC) to send a sentence to your server and get the result back. The GitHub link above has two blog posts covering that.