How to convert Tensorflow 2.0 SavedModel to TensorRT?

Question

I've trained a model in TensorFlow 2.0 and am trying to improve prediction time when moving to production (on a server with GPU support). In TensorFlow 1.x I was able to get a prediction speedup by freezing the graph, but this has been deprecated as of TensorFlow 2. NVIDIA's description of TensorRT suggests that it can speed up inference by 7x compared to TensorFlow alone. Source:

TensorFlow 2.0 with Tighter TensorRT Integration Now Available

I have trained my model and saved it to a .h5 file using TensorFlow's SavedModel format. I am now following NVIDIA's documentation to optimize the model for inference with TensorRT: TF-TRT 2.0 Workflow With A SavedModel.

When I run:

import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

I get the error: ModuleNotFoundError: No module named 'tensorflow.python.compiler.tensorrt'

They give another example with Tensorflow 2.0 here: Examples. However, they try to import the same module as above and I get the same error.

Can anyone suggest how to optimize my model with TensorRT?

What version of TF are you using, and how did you build/install it? — Pooya Davoodi
@PooyaDavoodi - I've figured out my issue (see answer below). You had the right question, in that it was an issue with the TF version - thanks — maurera

maurera · Accepted Answer · 2019-11-18T17:06:10

I've solved this issue. The problem was that I was testing the code on my local Windows machine, rather than on my AWS EC2 instance with GPU support.

It seems that tensorflow.python.compiler.tensorrt is included in tensorflow-gpu, but not in standard tensorflow. In order to convert the SavedModel instance with TensorRT, you need to use a machine with tensorflow-gpu. (I knew that this would be required to run the model, but hadn't realized it was needed to convert the model.)
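For anyone hitting the same error, here is a minimal sketch of how the check and the conversion might look. It assumes the TF-TRT converter class `TrtGraphConverterV2` from the NVIDIA workflow linked in the question; the helper names `module_available` and `convert_saved_model` are my own, and I haven't verified this across TensorFlow versions.

```python
import importlib.util

# Module that was missing on the machine without tensorflow-gpu.
TRT_MODULE = "tensorflow.python.compiler.tensorrt"


def module_available(name):
    """Return True if the dotted module name can be located.

    find_spec imports parent packages, so a missing parent raises
    ModuleNotFoundError; that case is treated as "not available".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def convert_saved_model(input_dir, output_dir):
    """Convert a SavedModel directory with TF-TRT and save the result.

    Both directory arguments are placeholders supplied by the caller.
    """
    if not module_available(TRT_MODULE):
        raise RuntimeError(
            TRT_MODULE + " not found; run the conversion on a machine "
            "with a GPU-enabled TensorFlow install"
        )
    # Imported lazily so that this file still loads on a CPU-only machine.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_dir)
    converter.convert()          # build the TensorRT-optimized graph
    converter.save(output_dir)   # write it out as a new SavedModel
```

On the GPU machine this would be called as something like `convert_saved_model("my_saved_model", "my_saved_model_trt")`, where both paths are hypothetical. On a machine without the module it raises a `RuntimeError` up front instead of failing at import time.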
