I have created a TF model that uses tf.lookup.StaticVocabularyTable to build a vocabulary map inside the TF graph. It reads the mapping from a text file and uses num_oov_buckets=500. Below is the relevant part of the code -
num_oov_buckets = 500
table_init = tf.lookup.TextFileInitializer('resmap.txt', tf.int64, 0, tf.int64, 1, delimiter=" ")
table = tf.lookup.StaticVocabularyTable(table_init, num_oov_buckets)
Using this, the model runs fine during both training and prediction.
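For reference, here is a minimal sketch (assuming TF 1.x graph mode and the same resmap.txt) of how I understand the table to behave: an out-of-vocab key such as 9 is hashed into one of the 500 buckets in the range [21663, 22163) instead of raising an error -

import tensorflow as tf

num_oov_buckets = 500
table_init = tf.lookup.TextFileInitializer('resmap.txt', tf.int64, 0, tf.int64, 1, delimiter=" ")
table = tf.lookup.StaticVocabularyTable(table_init, num_oov_buckets)

with tf.Session() as sess:
    sess.run(tf.tables_initializer())
    # In-vocab keys return their mapped value from resmap.txt;
    # OOV keys (e.g. 9) are hashed into [vocab_size, vocab_size + num_oov_buckets).
    print(sess.run(table.lookup(tf.constant([9], dtype=tf.int64))))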
I convert this TF model into a TensorFlow Serving SavedModel using the code below -
import os
import tensorflow as tf
from model import ModelWDN

with tf.Session() as sess:
    tf.app.flags.DEFINE_string('f', '', 'kernel')
    tf.app.flags.DEFINE_integer('model_version', 1, 'version number of the model.')
    tf.app.flags.DEFINE_string('save_dir', '/home/abhilash', 'Saving directory.')
    FLAGS = tf.app.flags.FLAGS
    export_path = os.path.join(tf.compat.as_bytes(FLAGS.save_dir), tf.compat.as_bytes(str(FLAGS.model_version)))
    print('Exporting trained model to', export_path)

    # Creating Model object and initializing all the global variables in TF Graph.
    model = ModelWDN(res_count=21663)
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())
    tf.train.Saver().restore(sess, os.path.join('/home/abhilash', 'wdn'))
    print("Model restored.")

    # SavedModel Builder Object
    builder = tf.saved_model.builder.SavedModelBuilder(export_path)

    # Converting Tensors to TensorInfo objects so that they can be used in SignatureDefs
    tensor_info_click_hist_str = tf.saved_model.utils.build_tensor_info(model.click_hist_str)
    tensor_info_res_to_predict_str = tf.saved_model.utils.build_tensor_info(model.res_to_predict_str)
    tensor_info_prob = tf.saved_model.utils.build_tensor_info(model.logits_all)

    # SignatureDef
    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'click_hist_str': tensor_info_click_hist_str,
                    'res_to_predict_str': tensor_info_res_to_predict_str},
            outputs={'probs': tensor_info_prob},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={'predict_ad_view_prob': prediction_signature},
        main_op=tf.tables_initializer(),
        strip_default_attrs=False,
    )

    # Export the model
    builder.save()
    print('Done exporting TF Model to SavedModel format!')
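To sanity-check the export outside of TF Serving, the SavedModel can be loaded back into a fresh session with the TF 1.x loader (a rough sketch, assuming the export directory is named 1 as in the saved_model_cli commands further down; loader.load should also run the main_op, i.e. tf.tables_initializer()):

import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    # Loads the graph, restores variables and runs the main_op (table init).
    meta_graph_def = tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], export_dir='1')
    sig = meta_graph_def.signature_def['predict_ad_view_prob']
    probs = sess.run(sig.outputs['probs'].name,
                     feed_dict={sig.inputs['click_hist_str'].name: ["18198449 18656271 18198449"],
                                sig.inputs['res_to_predict_str'].name: ["9 18788418 19039855 18771619"]})
    print(probs)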
The model is converted without any error and gives correct predictions in serving as long as every value I provide exists in resmap.txt (the file I passed to tf.lookup.TextFileInitializer). Any value that does not exist in this map causes an error in serving when I make a curl request, but the same value causes no error otherwise (i.e. when predicting from the TF model inside a session).
Curl request -
curl -X POST http://localhost:8501/v1/models/1:predict -d '{"signature_name": "predict_ad_view_prob", "inputs":{"res_to_predict_str": ["9 18788418 19039855 18771619"], "click_hist_str": ["18198449 18656271 18198449"]}}'
Here 9 is an id that is not present in resmap.txt.
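The same request via Python (a small sketch using the requests library against the same REST endpoint and signature name):

import json
import requests

payload = {
    "signature_name": "predict_ad_view_prob",
    "inputs": {
        "res_to_predict_str": ["9 18788418 19039855 18771619"],
        "click_hist_str": ["18198449 18656271 18198449"],
    },
}
# 9 is not present in resmap.txt, so it should land in one of the OOV buckets.
resp = requests.post("http://localhost:8501/v1/models/1:predict", data=json.dumps(payload))
print(resp.status_code, resp.json())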
Below is the error I get when doing a curl request -
{ "error": "indices[0] = 21748 is not in [0, 21663)\n\t [[{{node GatherV2_5}}]]" }
resmap.txt has 21663 key-value pairs and num_oov_buckets is set to 500.
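For context, resmap.txt is a space-delimited file with an int64 key in column 0 and an int64 value in column 1 (matching the TextFileInitializer arguments above); the lines below are made up just to illustrate the format -

18198449 0
18656271 1
18788418 2
...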
The same input, when predicting inside a TF session, gives a correct result -
[[0.10621755 0.50749264 0.08582641 0.00173556]]
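For completeness, that in-session prediction looks roughly like this (a sketch; the exact call depends on ModelWDN, but the placeholders and checkpoint are the same ones used in the export script above):

import os
import tensorflow as tf
from model import ModelWDN

with tf.Session() as sess:
    model = ModelWDN(res_count=21663)
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    sess.run(tf.tables_initializer())
    tf.train.Saver().restore(sess, os.path.join('/home/abhilash', 'wdn'))
    # Same strings as the curl request, including the OOV id 9.
    probs = sess.run(model.logits_all,
                     feed_dict={model.click_hist_str: ["18198449 18656271 18198449"],
                                model.res_to_predict_str: ["9 18788418 19039855 18771619"]})
    print(probs)  # [[0.10621755 0.50749264 0.08582641 0.00173556]]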
So there is clearly some problem with num_oov_buckets, or the graph containing this table is not handled correctly in serving. If I am missing something or building the TF SavedModel incorrectly, please let me know.
UPDATE - Adding saved_model_cli show and run commands
saved_model_cli show --dir 1 --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['predict_ad_view_prob']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['click_hist_str'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: Placeholder_3:0
    inputs['res_to_predict_str'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: Placeholder_5:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probs'] tensor_info:
        dtype: DT_DOUBLE
        shape: (-1, -1)
        name: Sigmoid:0
  Method name is: tensorflow/serving/predict
saved_model_cli run --dir 1 --tag_set serve --signature_def predict_ad_view_prob --input_exprs 'click_hist_str=["50 50"];res_to_predict_str=["50 303960 1 2"]'
2019-07-18 10:18:54.805220: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-18 10:18:54.810121: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.811041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
2019-07-18 10:18:54.811492: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-18 10:18:54.813643: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-18 10:18:54.815415: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-18 10:18:54.815914: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-18 10:18:54.818528: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-18 10:18:54.820856: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-18 10:18:54.826085: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-18 10:18:54.826234: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.827152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.827807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-18 10:18:54.828138: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-18 10:18:54.856561: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300065000 Hz
2019-07-18 10:18:54.857004: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5635e1749450 executing computations on platform Host. Devices:
2019-07-18 10:18:54.857037: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-07-18 10:18:54.984822: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.985784: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5635e36188b0 executing computations on platform CUDA. Devices:
2019-07-18 10:18:54.985823: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Tesla K80, Compute Capability 3.7
2019-07-18 10:18:54.986072: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.987021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:00:1e.0
2019-07-18 10:18:54.987099: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-18 10:18:54.987152: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-18 10:18:54.987202: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-18 10:18:54.987250: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-18 10:18:54.987300: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-18 10:18:54.987362: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-18 10:18:54.987413: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-18 10:18:54.987554: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.988526: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.989347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-18 10:18:54.989418: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-18 10:18:54.995160: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-18 10:18:54.995475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-18 10:18:54.995629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-18 10:18:54.995938: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.996963: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-18 10:18:54.997884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8895 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7)
WARNING: Logging before flag parsing goes to stderr.
W0718 10:18:54.999173 140274532570944 deprecation.py:323] From /home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/tools/saved_model_cli.py:339: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
W0718 10:18:55.271977 140274532570944 deprecation.py:323] From /home/ubuntu/anaconda3/lib/python3.7/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-07-18 10:18:56.953677: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
2019-07-18 10:18:56.979903: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
Result for output key probs:
[[0.14920072 0.07349582 0.12342736 0.12342736]]
So saved_model_cli run works fine and the SavedModel seems to be created correctly, but it still does not work correctly with TF Serving.