TensorFlow-Slim Inception-ResNet-v2 inference

Question

https://github.com/tensorflow/models/blob/master/research/slim/nets/inception_resnet_v2.py

How do you run inference on this model properly? I want to set this up so that a user can run inference on individual images entered at the command line one by one. For this to be fast, the model must be loaded ONCE, and input images must be hot-swappable as the user enters them.

You can do non-hot-swappable inference if you use a similar structure to the evaluation code for this model: https://github.com/tensorflow/models/blob/master/research/slim/eval_image_classifier.py

You can slightly modify the above file to print your logits and do inference. However, this solution rebuilds the graph each time, which is really slow.

I tried building the graph once and passing a feed_dict to the fifo_queue_Dequeue:0 tensor, which represents the batched input. However, the session would hang and never compute anything. I believe it's because the graph is 'frozen': the tensors cannot take in new input. Now I'm stumped as to how to get the behavior I want.

vijay m · Accepted Answer · 2018-05-29T19:33:42

The inference steps are given below.

Creating the Inception-resnet-v2 graph

import sys

import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session, tf.contrib)

# Make the tensorflow/models checkout importable (adjust to your own path)
sys.path.append('/home/vijay/workspace/learning/tensorflow/')

# Load the definitions of the Inception-ResNet-v2 architecture
import tensorflow.contrib.slim as slim
from models.research.slim.nets.inception_resnet_v2 import inception_resnet_v2, inception_resnet_v2_arg_scope


# The pretrained model expects 299x299 RGB images
HEIGHT = 299
WIDTH = 299
CHANNELS = 3

# Create Graph

graph = tf.Graph()
with graph.as_default():

   # Create a placeholder to pass the input image
   img_tensor = tf.placeholder(tf.float32, shape=(None, HEIGHT, WIDTH, CHANNELS))

   # Scale the image inputs from [0, 255] to [-1, 1]
   img_scaled = tf.scalar_mul((1.0/255), img_tensor)
   img_scaled = tf.subtract(img_scaled, 0.5)
   img_scaled = tf.multiply(img_scaled, 2.0)

   # load Graph definitions
   with slim.arg_scope(inception_resnet_v2_arg_scope()):
      logits, end_points = inception_resnet_v2(img_scaled, is_training=False)

   # predict the class
   predictions = end_points['Predictions']
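
The three scaling ops in the graph amount to x -> 2 * (x / 255 - 0.5). As a quick sanity check, here is the same arithmetic in plain Python. The helper name `scale_pixel` is hypothetical, and it divides by 255 instead of multiplying by 1/255, which is equivalent up to floating-point rounding.

```python
def scale_pixel(value):
    """Map a pixel intensity from [0, 255] to [-1, 1]."""
    scaled = value / 255.0   # [0, 255]    -> [0, 1]
    scaled = scaled - 0.5    # [0, 1]      -> [-0.5, 0.5]
    return scaled * 2.0      # [-0.5, 0.5] -> [-1, 1]

print(scale_pixel(0), scale_pixel(127.5), scale_pixel(255))  # -1.0 0.0 1.0
```

Doing this inside the graph means the caller can feed raw 0 to 255 pixel values straight into the placeholder.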

Load a test image:

import cv2
import numpy as np

# Load a test image (OpenCV reads images in BGR order, so convert to RGB)
img = cv2.imread('/home/vijay/datasets/image/misc/Bernese-Mountain- Dog.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (WIDTH, HEIGHT))  # cv2.resize takes (width, height)

# Add a batch dimension: the network expects [BATCH, HEIGHT, WIDTH, CHANNELS]
img = np.expand_dims(img, axis=0)

Load weights and run the graph

#for labels of imagenet 
sys.path.append('/home/vijay/workspace/learning/tensorflow/models/research/slim')
from datasets import imagenet

# Inception resnet v2 model 
checkpoint_file='/home/vijay/datasets/pre_trained_models/inception_resnet_v2_2016_08_30.ckpt'

with tf.Session(graph=graph) as sess:

   saver = tf.train.Saver()
   saver.restore(sess, checkpoint_file)

   pred_prob = sess.run(predictions, feed_dict={img_tensor: img})

   # Get the top 5 classes of the ImageNet database
   probabilities = pred_prob[0, 0:]
   sorted_inds = [i[0] for i in sorted(enumerate(-probabilities), key=lambda x: x[1])]

   names = imagenet.create_readable_names_for_imagenet_labels()
   for i in range(5):
       index = sorted_inds[i]
       # The values are fractions in [0, 1], so a printed 0.84 means 84%
       print('Probability %0.2f%% => [%s]' % (probabilities[index], names[index]))
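
The snippet above sorts every class score just to pick five. A minimal alternative sketch using the standard library's `heapq.nlargest` returns only the k best indices; the helper name `top_k_indices` is hypothetical and not part of slim. It should work on a list or a 1-D NumPy array, since it only needs `len()` and indexing.

```python
import heapq

def top_k_indices(scores, k=5):
    """Return the indices of the k largest scores, best first."""
    return heapq.nlargest(k, range(len(scores)), key=lambda i: scores[i])

print(top_k_indices([0.01, 0.84, 0.04, 0.03, 0.00, 0.08], k=3))  # [1, 5, 2]
```

With NumPy already imported, `np.argsort(-probabilities)[:5]` is another common way to get the same ranking.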

Output

Probability 0.84% => [Bernese mountain dog]
Probability 0.04% => [Appenzeller]
Probability 0.03% => [EntleBucher]
Probability 0.01% => [Greater Swiss Mountain dog]
Probability 0.00% => [Border collie]
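
To get the load-once behavior the question asks for, the session can be kept open and `sess.run` called repeatedly, since the placeholder accepts a new image on every call. The sketch below shows only the loop structure. `predict` is a hypothetical callable standing in for the image loading plus the `sess.run(predictions, feed_dict={img_tensor: img})` step, and `serve` and the `quit` command are illustrative names, not anything provided by slim.

```python
def serve(predict, read_line=input, write=print):
    """Read image paths one at a time and classify each with an already-loaded model.

    `predict` is assumed to wrap an open session, so the graph is built and the
    checkpoint is restored once, before this loop starts.
    """
    handled = 0
    while True:
        try:
            path = read_line().strip()
        except EOFError:  # stdin was closed
            break
        if path in ('', 'quit'):
            break
        try:
            write(predict(path))
        except Exception as err:  # keep serving after a bad path or unreadable image
            write('Could not classify %s: %s' % (path, err))
        handled += 1
    return handled
```

One way to use it, under these assumptions, is to define `predict` inside the `with tf.Session(graph=graph) as sess:` block after `saver.restore(...)`, then call `serve(predict)` there so that every request reuses the same restored session.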