
I was trying to run a vanilla ImageNet classification with the VGG16 network in TensorFlow (which provides VGG16 through its Keras backbone).

However, when I tried to run classification on a sample elephant image, it gave completely unexpected results.

I am not able to figure out what might be the issue.

Here is the complete code I used:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils


model = tf.keras.applications.VGG16()
VGG = model.graph

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)


with tf.Session(graph=VGG) as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    pred = (sess.run(output,{input:image_array}))
    print(imagenet_utils.decode_predictions(pred))

Below is the sample output I get:

Tensor("input_1:0", shape=(?, 224, 224, 3), dtype=float32)
Tensor("predictions/Softmax:0", shape=(?, 1000), dtype=float32)

[[('n02281406', 'sulphur_butterfly', 0.0022673723), ('n01882714', 'koala', 0.0021256246), ('n04325704', 'stole', 0.0020583202), ('n01496331', 'electric_ray', 0.0020416214), ('n01797886', 'ruffed_grouse', 0.0020229272)]]

From the probabilities it looks like there is something wrong with the passed image data (as all are very low).

But I couldn't figure out what is wrong.
And as a human, I am very sure the image is of an elephant!


2 Answers

0 votes

I think there are two mistakes. The first one is that you must rescale your image by dividing all pixels by 255.

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = image_array / 255.  # note: the array is uint8, so an in-place /= would fail here
image_array = np.expand_dims(image_array, axis=0)

I noticed the second point while looking at the prediction values. You have a vector of 1000 elements and all of them sit around 0.1% probability. That means you have an untrained model. I don't know exactly how to load the weights in plain TensorFlow, but in Keras, for example, you can do:

from keras import applications

app = applications.vgg16
model = app.VGG16(
        include_top=True,     # keep the standard ImageNet classifier on top
        weights='imagenet',   # this loads the ImageNet weights; otherwise they are random
        pooling="avg")        # only relevant when include_top=False

From what I've read, you may have to download a separate file containing the weights, for example from GitHub.
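If you go that route, a minimal sketch of loading a downloaded file would look like the following (the filename is an assumption based on the file published with the Keras releases on GitHub; point the path at wherever you saved it):

from keras import applications

app = applications.vgg16
model = app.VGG16(include_top=True, weights=None)  # start with random weights
# assumed filename/path of the downloaded weights file
model.load_weights('vgg16_weights_tf_dim_ordering_tf_kernels.h5')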

I hope it helps,

EDIT1:

I tried the same model using Keras:

from keras.applications.vgg16 import VGG16, decode_predictions
from PIL import Image
import numpy as np

model = VGG16(weights='imagenet')

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = image_array/255.
x = np.expand_dims(image_array, axis=0)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=5)[0])

If I keep the rescaling, I get bad predictions:

Predicted: [('n03788365', 'mosquito_net', 0.22725257), ('n15075141', 'toilet_tissue', 0.026636025), ('n04209239', 'shower_curtain', 0.019786758), ('n02804414', 'bassinet', 0.01353887), ('n03131574', 'crib', 0.01316699)]

Without the rescaling, it is good:

Predicted: [('n02504458', 'African_elephant', 0.95870858), ('n01871265', 'tusker', 0.040065952), ('n02504013', 'Indian_elephant', 0.0012253703), ('n01704323', 'triceratops', 5.0949382e-08), ('n02454379', 'armadillo', 5.0408511e-10)]

Now, if I remove the weights, I get the "same" kind of result as you got with TensorFlow:

Predicted: [('n07717410', 'acorn_squash', 0.0010033853), ('n02980441', 'castle', 0.0010028203), ('n02124075', 'Egyptian_cat', 0.0010028186), ('n04179913', 'sewing_machine', 0.0010027955), ('n02492660', 'howler_monkey', 0.0010027081)]

To me, that means that no weights are applied in your case. Maybe they are downloaded but not used.
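For reference, "removing the weights" here just means building the model without the ImageNet weights, something like:

from keras.applications.vgg16 import VGG16

# weights=None gives a randomly initialized model: every class ends up near 1/1000 = 0.1%
model = VGG16(weights=None)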

0 votes

It seems that we can (or need to?) use the session from Keras (which already has the graph and the loaded weights associated with it) instead of creating a new TensorFlow session around the graph obtained from the Keras model like below

VGG = model.graph  

I think the graph obtained above carries no trained weight values by itself (running tf.global_variables_initializer() in a fresh session just gives random weights, which is why the predictions are wrong), while the Keras session already holds the trained ImageNet weights.

Below is the full code:

import tensorflow as tf
import numpy as np
from PIL import Image
from tensorflow.python.keras._impl.keras.applications import imagenet_utils
from tensorflow.python.keras._impl.keras import backend as K


model = tf.keras.applications.VGG16()
sess = K.get_session()
VGG = model.graph  # the graph alone doesn't carry the trained weight values; they live in the Keras session

VGG.get_operations()
input = VGG.get_tensor_by_name("input_1:0")
output = VGG.get_tensor_by_name("predictions/Softmax:0")
print(input)
print(output)

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3]
image_array = np.expand_dims(image_array, axis=0)
image_array = image_array.astype(np.float32)
image_array = tf.keras.applications.vgg16.preprocess_input(image_array)

pred = (sess.run(output,{input:image_array}))
print(imagenet_utils.decode_predictions(pred))

And this gives the expected result:

[[('n02504458', 'African_elephant', 0.8518132), ('n01871265', 'tusker', 0.1398836), ('n02504013', 'Indian_elephant', 0.0082286), ('n01704323', 'triceratops', 6.965483e-05), ('n02397096', 'warthog', 1.8662439e-06)]]

Thanks to Idavid for the tip about using the preprocess_input() function and to Nicolas for the tip about the unloaded weights.
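As a side note (a sketch, not part of the original answers): if you don't specifically need to pull tensors out of the graph, you can let the tf.keras model run the prediction itself, since model.predict() uses the Keras session where the weights are already loaded:

import numpy as np
import tensorflow as tf
from PIL import Image

model = tf.keras.applications.VGG16(weights='imagenet')

I = Image.open("Elephant.jpg")
new_img = I.resize((224,224))
image_array = np.array(new_img)[:, :, 0:3].astype(np.float32)
image_array = np.expand_dims(image_array, axis=0)
image_array = tf.keras.applications.vgg16.preprocess_input(image_array)  # mean subtraction / channel ordering expected by VGG16

preds = model.predict(image_array)  # runs in the Keras session with the loaded ImageNet weights
print(tf.keras.applications.vgg16.decode_predictions(preds, top=5))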