I trained a face recognition model using quantization-aware training in TensorFlow 1.12.0. The network is inception-resnet_v1 (the code comes from tensorflow/models/research/slim/nets/). After training completed I had a ckpt, so I wrote a new freeze.py to generate eval.pb and then successfully generated the tflite model with toco. But when I finally tested the tflite model with an image, I got the following error:
  File "src/test_tflite.py", line 21, in <module>
    interpreter.allocate_tensors()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/interpreter.py", line 71, in allocate_tensors
    return self._interpreter.AllocateTensors()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/lite/python/interpreter_wrapper/tensorflow_wrap_interpreter_wrapper.py", line 106, in AllocateTensors
    return _tensorflow_wrap_interpreter_wrapper.InterpreterWrapper_AllocateTensors(self)
RuntimeError: tensorflow/contrib/lite/kernels/pooling.cc:103 input->params.scale != output->params.scale (102483008 != 102482528)Node number 116 (MAX_POOL_2D) failed to prepare.
I also tried swapping in other networks (inception-v3 and inception-resnet-v2), but each one produced a similar error.
My training code is based on the facenet framework, with small changes to the original training script. After defining total_loss_op, I added the following two lines:
train_graph = tf.get_default_graph()
tf.contrib.quantize.create_training_graph(input_graph=train_graph, quant_delay=20000)
In freeze.py, after the inference graph is defined, I added the following code:
g = tf.get_default_graph()
tf.contrib.quantize.create_eval_graph(input_graph=g)
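I then load the previously trained ckpt and save the result as a pb file. The code is as follows:
import os
from tensorflow.python.framework import graph_util

# Restore the trained (fake-quantized) weights into the eval graph.
saver = tf.train.Saver(tf.global_variables())
sess = tf.Session()
with sess.as_default():
    saver.restore(sess, ckpt_model_path)
    # Fold variables into constants, keeping only what 'embeddings' needs.
    frozen_graph_def = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['embeddings'])
    # Write the frozen graph as a binary pb file.
    tf.train.write_graph(
        frozen_graph_def,
        os.path.dirname(save_pb_path),
        os.path.basename(save_pb_path),
        as_text=False)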
I then used the toco tool from TensorFlow 1.12.0 to convert the pb file, which successfully produced a tflite model. The exact command was:
./bazel-bin/tensorflow/contrib/lite/toco/toco \
--input_file=inception_resnet_v1_fake_quantized_eval.pb \
--output_file=tflite_model.tflite \
--input_format=TENSORFLOW_GRAPHDEF \
--output_format=TFLITE \
--inference_type=QUANTIZED_UINT8 \
--input_shape="1,160,160,3" \
--input_array=input \
--output_array=embeddings \
--std_value=127.5 \
--mean_value=127.5 \
--default_ranges_min=-1.0 \
--default_ranges_max=1.0
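As a rough sketch of what these flags imply, assuming the usual affine uint8 scheme: the input is mapped as real = (quantized - mean_value) / std_value, and a tensor with range [min, max] gets scale = (max - min) / 255. The helper names below are illustrative only and are not part of toco.

```python
def dequantize_input(q, mean_value, std_value):
    """Real value the model sees for a uint8 input value q."""
    return (q - mean_value) / std_value


def range_to_scale(range_min, range_max, num_bits=8):
    """Quantization step for an affine range [range_min, range_max]."""
    levels = 2 ** num_bits - 1  # 255 steps for uint8
    return (range_max - range_min) / float(levels)


# With mean_value = std_value = 127.5, uint8 pixels 0..255 map to [-1, 1].
print(dequantize_input(0, 127.5, 127.5), dequantize_input(255, 127.5, 127.5))

# Scale implied by --default_ranges_min=-1.0 / --default_ranges_max=1.0.
print(range_to_scale(-1.0, 1.0))
```

The default range is only applied to tensors that have no recorded min/max, so such a tensor can end up with a slightly different scale from a neighbor whose range was learned during training. Whether that is what trips the MAX_POOL_2D check here is an assumption, not something the log confirms.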
Finally, I ran the generated tflite model on a test image and got the following error:
RuntimeError: tensorflow/contrib/lite/kernels/pooling.cc:103 input->params.scale != output->params.scale (102483008 != 102482528)Node number 116 (MAX_POOL_2D) failed to prepare.
My test code is as follows:
import numpy as np
import tensorflow as tf
import scipy.misc  # imread lives in the misc submodule, which must be imported explicitly
# Load the TFLite model and allocate tensors.
interpreter = tf.contrib.lite.Interpreter(model_path="tensorflow-1.12.0/tflite_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
image = scipy.misc.imread("src/1511.jpg")
image_ = np.array([image.astype('uint8')])
print(image_.shape)
print(type(image_))
print(input_details)
print(output_details)
interpreter.set_tensor(input_details[0]['index'], image_)
interpreter.invoke()
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
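One way to narrow this down might be to list the quantization parameters of the tensors around the pooling ops. The sketch below is a plain-Python helper that works on a list of dicts in the same layout as the entries returned by get_input_details() (a 'name' and a 'quantization' pair of scale and zero point). The helper name, the "MaxPool" keyword, and the sample tensor names and values are all hypothetical.

```python
def tensors_matching(tensor_details, keyword="MaxPool"):
    """Return (name, scale, zero_point) for tensors whose name contains keyword."""
    rows = []
    for detail in tensor_details:
        if keyword in detail["name"]:
            scale, zero_point = detail["quantization"]
            rows.append((detail["name"], scale, zero_point))
    return rows


# Made-up sample data, only to show the expected dict layout.
sample = [
    {"name": "input", "index": 0, "quantization": (0.0078125, 128)},
    {"name": "net/MaxPool_3a/MaxPool", "index": 7, "quantization": (0.0235, 0)},
]

for row in tensors_matching(sample):
    print(row)
```

With a real model, the list could come from interpreter.get_tensor_details(), assuming the installed TensorFlow Lite build provides that method; I have not confirmed that it can be called when allocate_tensors() fails.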