
I'm trying to get the probability for each class out of a Keras model. Here's the sample model:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout

width = 80
height = 80

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(width, height, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

However, after the model is trained, I load an image and run a prediction:

from keras.preprocessing import image
import numpy as np

img = image.load_img('Test2.jpg', target_size=(80, 80))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add a batch dimension
images = np.vstack([x])
classes = model.predict_proba(images, batch_size=1)
print(classes)

[[ 0.  1.]]

I still get the class labels rather than probabilities. Any hints on what I'm doing wrong?

EDIT: This is how the model is trained:

from keras.preprocessing.image import ImageDataGenerator

batch_size = 16  # assumption: the actual value is not shown in the question

train_datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

# this is the augmentation configuration we will use for testing
# (note: currently identical to the training augmentation, not just rescaling)
test_datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')


train_generator = train_datagen.flow_from_directory(
        '.\\train',  # this is the target directory
        target_size=(width, height),  # all images will be resized to 80x80
        batch_size=batch_size,
        class_mode='binary',
        shuffle=True)  # class_mode='binary' yields single integer labels (0 or 1)

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
        '.\\validate',
        target_size=(width, height),
        batch_size=batch_size,
        class_mode='binary',
        shuffle=True)

model.fit_generator(
        train_generator,
        steps_per_epoch=4000,
        epochs=2,
        validation_data=validation_generator,
        validation_steps=1600)
Have you normalized your training data? Have you normalized your input image accordingly? - Marcin Możejko
0 and 1 are also valid probabilities. - Dr. Snoopy
Hey guys, thanks for the prompt reactions. I updated the question with the code that trains the model and loads the samples. I don't think I'm doing normalization anywhere, or am I wrong? (Keras newbie.) It looks rather odd to me that the probability of either class would be that high, right? I'll try different samples and let you know. - TechCrap

1 Answer


The problem is that you are using the 'sparse_categorical_crossentropy' loss together with class_mode='binary' in your flow_from_directory calls.

You have two possibilities here:

  1. Change the loss to 'categorical_crossentropy' and set class_mode='categorical'.
  2. Leave the loss as is but set class_mode='sparse'.

Either will work.
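
For example (a minimal sketch reusing width, height, batch_size and train_datagen from the question; the validation generator needs the same class_mode):

# Option 1: one-hot labels to match 'categorical_crossentropy'
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

train_generator = train_datagen.flow_from_directory(
        '.\\train',
        target_size=(width, height),
        batch_size=batch_size,
        class_mode='categorical',  # yields one-hot encoded labels
        shuffle=True)

# Option 2: keep the sparse loss, switch the generators to integer labels
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

train_generator = train_datagen.flow_from_directory(
        '.\\train',
        target_size=(width, height),
        batch_size=batch_size,
        class_mode='sparse',  # yields integer class labels
        shuffle=True)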

Refer to this answer for the difference between the two losses (it discusses TensorFlow, but the same holds for Keras). The short version: the sparse loss expects integer class labels (e.g. 1, 2, 3, ...), whereas the regular one expects one-hot encoded vectors (e.g. [0, 1, 0, 0]).
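
To make the two label formats concrete, here is a small sketch (keras.utils.to_categorical converts integer labels to one-hot vectors):

import numpy as np
from keras.utils import to_categorical

# integer class labels, as expected by 'sparse_categorical_crossentropy'
sparse_labels = np.array([0, 1, 1, 0])

# one-hot encoded labels, as expected by 'categorical_crossentropy'
one_hot_labels = to_categorical(sparse_labels, num_classes=2)
print(one_hot_labels)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]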

Cheers

EDIT: As @Simeon Kredatus pointed out, it was a normalization issue. This can be solved by setting the appropriate flags in the ImageDataGenerator constructors for both the training and test sets, namely samplewise_center=True and samplewise_std_normalization=True. Updating the answer so people can see the solution. In general, remember the garbage-in, garbage-out principle.
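
A minimal sketch of that fix (model and the file name are reused from the question; ImageDataGenerator.standardize applies the configured normalization to a batch, which works here because the samplewise options need no fitted statistics):

from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
import numpy as np

train_datagen = ImageDataGenerator(
        samplewise_center=True,             # zero-center each sample
        samplewise_std_normalization=True,  # divide each sample by its own std
        rotation_range=40,
        horizontal_flip=True)

# the same normalization must be applied to any image you predict on
img = image.load_img('Test2.jpg', target_size=(80, 80))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = train_datagen.standardize(x)  # same samplewise normalization as training
print(model.predict(x, batch_size=1))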