I would like to include my custom pre-processing logic in my exported Keras model for use in Tensorflow Serving.
My pre-processing performs string tokenization and uses an external dictionary to convert each token to an index for input to the Embedding layer:
from keras.preprocessing import sequence
token_to_idx_dict = ... #read from file
# Custom Pythonic pre-processing steps on input_data
tokens = [tokenize(s) for s in input_data]
token_idxs = [[token_to_idx_dict[t] for t in ts] for ts in tokens]
tokens_padded = sequence.pad_sequences(token_idxs, maxlen=maxlen)
Model architecture and training:
model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))
model.add(LSTM(128, activation='sigmoid'))
model.add(Dense(n_classes, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(x_train, y_train)
Since the model will be used in Tensorflow Serving, I want to incorporate all pre-processing logic into the model itself (encoded in the exported model file).
Q: How can I do so using the Keras library only?
I found this guide explains how to combine Keras and Tensorflow. But I'm still unsure how to export everything as one model.
I know Tensorflow has built-in string splitting, file I/O, and dictionary lookup operations.
Pre-processing logic using Tensorflow operations:
# Get input text
input_string_tensor = tf.placeholder(tf.string, shape={1})
# Split input text by whitespace
splitted_string = tf.string_split(input_string_tensor, " ")
# Read index lookup dictionary
token_to_idx_dict = tf.contrib.lookup.HashTable(tf.contrib.lookup.TextFileInitializer("vocab.txt", tf.string, 0, tf.int64, 1, delimiter=","), -1)
# Convert tokens to indexes
token_idxs = token_to_idx_dict.lookup(splitted_string)
# Pad zeros to fixed length
token_idxs_padded = tf.pad(token_idxs, ...)
Q: How can I use these Tensorflow pre-defined pre-processing operations and my Keras layers together to both train and then export the model as a "black box" for use in Tensorflow Serving?