TensorFlow Serving: confusion on feature_configs data format

Question

I followed the TensorFlow Serving tutorial mnist_saved_model.py and am trying to train and export a text CNN classifier model. The pipeline is:

embedding layer -> cnn -> maxpool -> cnn -> dropout -> output layer

The original TensorFlow data input:

data_in = tf.placeholder(tf.int32,[None, sequence_length] , name='data_in')

was changed to:

  serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
  feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length], 
                     dtype=tf.int64),}
  tf_example = tf.parse_example(serialized_tf_example, feature_configs)
  # use tf.identity() to assign name
  data_in = tf.identity(tf_example['x'], name='x')

This works during training, but at test time the request fails with AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Expects arg[0] to be int64 but string is provided").

I am confused about this line:

 feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length], 
                    dtype=tf.int64),}

I changed the line to

  feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length], 
                     dtype=tf.string),}

but it gives the following error at training time:

Traceback (most recent call last):
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/tf_serving/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.py", line 222, in <module>
    embedded_chars = tf.nn.embedding_lookup(W, data_in)
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/embedding_ops.py", line 122, in embedding_lookup
    return maybe_normalize(_do_gather(params[0], ids, name=name))
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/embedding_ops.py", line 42, in _do_gather
    return array_ops.gather(params, ids, name=name)
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/ops/gen_array_ops.py", line 1179, in gather
    validate_indices=validate_indices, name=name)
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 589, in apply_op
    param_name=input_name)
  File "/serving/bazel-bin/tensorflow_serving/example/twitter-sentiment-cnn_saved_model.runfiles/org_tensorflow/tensorflow/python/framework/op_def_library.py", line 60, in _SatisfiesTypeConstraint
    ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
TypeError: Value passed to parameter 'indices' has DataType string not in list of allowed values: int32, int64

Feixiang.Wen · Accepted Answer · 2017-09-01T09:23:34

Your code is wrong:

serialized_tf_example = tf.placeholder(tf.string, name='tf_example')

This means your input is a string, such as the words of a sentence. Therefore:

feature_configs = {'x': tf.FixedLenFeature(shape=[sequence_length], 
                   dtype=tf.int64),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)

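In my opinion this accomplishes nothing, because you never use a vocabulary to convert the strings to integers. Declaring the feature as int64 does not turn words into ids. You need to load the vocabulary built from your training data and look up each word's index!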
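As a rough illustration of the vocabulary lookup described above, here is a minimal sketch in plain Python. It is not taken from the asker's code: the helper names build_vocab and encode, the reserved ids PAD_ID and UNK_ID, and the whitespace tokenization are all assumptions made for the example. The real vocabulary should be the one produced when the model was trained.

```python
from typing import Dict, List

# Hypothetical reserved ids; use whatever your training pipeline used.
PAD_ID = 0  # fills sentences shorter than sequence_length
UNK_ID = 1  # stands in for words missing from the vocabulary


def build_vocab(sentences: List[str]) -> Dict[str, int]:
    """Assign an integer id to each distinct word, in order of first appearance."""
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence: str, vocab: Dict[str, int], sequence_length: int) -> List[int]:
    """Map words to ids, then truncate or pad to exactly sequence_length."""
    ids = [vocab.get(word, UNK_ID) for word in sentence.lower().split()]
    ids = ids[:sequence_length]
    return ids + [PAD_ID] * (sequence_length - len(ids))


vocab = build_vocab(["good movie", "bad movie"])
ids = encode("good film", vocab, sequence_length=4)
print(ids)
```

The resulting list of integers, rather than the raw sentence, is what would go into the int64 feature 'x' (for instance through tf.train.Int64List) before the tf.train.Example is serialized and sent to the server.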
