I am trying to train ResNet50 on the MNIST dataset using the Keras library. MNIST images have shape (28, 28, 1), but my ResNet50 base requires inputs of shape (32, 32, 3).

How can I convert the mnist dataset to the required shape?

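from keras import models, optimizers
from keras.datasets import mnist
from keras.layers import BatchNormalization, Dense, Dropout, Flatten
from keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], x_train.shape[2], 1)
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], x_test.shape[2], 1)
# scale pixel values to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0
# one-hot encode the labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)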
model = models.Sequential()
# model.add(InputLayer(input_shape=(28, 28)))
# model.add(Reshape(target_shape=(32, 32, 3)))
# model.add(Conv2D())
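# conv_base is the ResNet50 base built earlier (not shown); per the error below it expects (32, 32, 3) input
model.add(conv_base)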
model.add(Flatten())
model.add(BatchNormalization())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(Dense(10, activation='softmax'))

model.compile(optimizer=optimizers.RMSprop(learning_rate=2e-5), loss='categorical_crossentropy', metrics=['acc'])

history = model.fit(x_train, y_train, epochs=5, batch_size=20, validation_data=(x_test, y_test))

This fails with:

ValueError: Input 0 is incompatible with layer sequential_10: expected shape=(None, 32, 32, 3), found shape=(20, 28, 28, 1)
ResNet50 requires RGB images whereas MNIST is grayscale; you should convert the data to RGB first. - Frightera
Note that reshape and resize are NOT the same thing. You need to resize the images, not reshape them (as your title suggests). - Dr. Snoopy
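
To illustrate the distinction drawn in the comments above, here is a minimal NumPy sketch. The `batch` array is a made-up stand-in for one batch of MNIST images, not data from the question. Reshaping only rearranges existing values, so the element count must stay the same; going from 28x28x1 to 32x32x3 needs new values, which is why the images have to be resized (or padded) and given extra channels.

```python
import numpy as np

# Hypothetical stand-in for one batch: 20 grayscale 28x28 images.
batch = np.zeros((20, 28, 28, 1), dtype="float32")

per_image_in = 28 * 28 * 1    # values available per MNIST image
per_image_out = 32 * 32 * 3   # values needed per image by the network

try:
    batch.reshape(20, 32, 32, 3)
    reshape_ok = True
except ValueError:
    # NumPy refuses: the total number of elements would have to change.
    reshape_ok = False

print(per_image_in, per_image_out, reshape_ok)
```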

1 Answer

You need to resize the MNIST images. The minimum input size depends on the ImageNet model: for example, Keras's Xception requires at least 71x71, whereas ResNet50 requires at least 32x32. In addition, MNIST images are grayscale (one channel), which conflicts with the pretrained ImageNet weights because those expect three-channel input. The safe approach is to resize the images and convert them from grayscale to RGB.
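
As a small illustration of the grayscale-to-RGB step, the sketch below repeats a single channel three times with NumPy. The 2x2 `gray` array is a made-up toy image; the same call works on a full batch of shape (N, H, W, 1).

```python
import numpy as np

# Hypothetical 2x2 grayscale "image" with a trailing channel axis: shape (2, 2, 1).
gray = np.array([[0, 64], [128, 255]], dtype="uint8")[..., np.newaxis]

# Copy the single channel three times along the last axis: shape (2, 2, 3).
rgb = np.repeat(gray, 3, axis=-1)

print(gray.shape, rgb.shape)

# All three channels hold the same intensities, so the image still looks gray.
channels_equal = (np.array_equal(rgb[..., 0], rgb[..., 1])
                  and np.array_equal(rgb[..., 1], rgb[..., 2]))
print(channels_equal)
```

No new information is added by this step; it only gives the array the channel count that the pretrained weights expect.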


Here is full working code.

Data Set

We resize the MNIST images from 28x28 to 32x32 and repeat the single gray channel to get 3 channels.

import tensorflow as tf 
import numpy as np 

(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()

# add a trailing channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
x_train = np.expand_dims(x_train, axis=-1)

# repeat the gray channel 3 times; needed here because the ImageNet weights expect 3 channels
x_train = np.repeat(x_train, 3, axis=-1)

# scale pixel values to [0, 1]
x_train = x_train.astype('float32') / 255

# resize each image from 28x28 to 32x32
x_train = tf.image.resize(x_train, [32, 32])

# one-hot encode the labels
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)

print(x_train.shape, y_train.shape)
(60000, 32, 32, 3) (60000, 10)
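
The code above only prepares the training split. One way to make sure a test split gets identical treatment is to wrap the steps in a function, sketched below. Note the assumptions: `prepare_images` is a hypothetical helper, it zero-pads the images to 32x32 with `np.pad` instead of interpolating with `tf.image.resize` (a different technique from the one used above), and the arrays are random stand-ins for the real `x_train` and `x_test`.

```python
import numpy as np

def prepare_images(images, target=32):
    """Turn uint8 grayscale images (N, H, W) into float32 (N, target, target, 3)."""
    _, h, w = images.shape
    pad_h, pad_w = target - h, target - w
    if pad_h < 0 or pad_w < 0:
        raise ValueError("images are larger than the target size")
    # Zero-pad as evenly as possible on each side (assumption: padding, not interpolation).
    padded = np.pad(
        images,
        ((0, 0), (pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)),
        mode="constant",
    )
    # Scale pixel values to [0, 1].
    scaled = padded.astype("float32") / 255.0
    # Add a channel axis and repeat it 3 times to mimic RGB.
    return np.repeat(scaled[..., np.newaxis], 3, axis=-1)

# Random stand-in data with MNIST's layout; swap in the real splits.
rng = np.random.default_rng(0)
fake_train = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

x_train_rgb = prepare_images(fake_train)
x_test_rgb = prepare_images(fake_test)
print(x_train_rgb.shape, x_test_rgb.shape)
```

Padding keeps the original pixels untouched and only adds a black border, whereas interpolation rescales the digit itself; either yields the (32, 32, 3) input the model needs.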

ResNet 50

inputs = tf.keras.Input(shape=(32, 32, 3))
resnet = tf.keras.applications.ResNet50(weights='imagenet',
                                        include_top=False,
                                        input_tensor=inputs)
# Apply global max pooling to the final feature maps.
pooled = tf.keras.layers.GlobalMaxPooling2D()(resnet.output)

# Finally, add a 10-way softmax classification layer.
output = tf.keras.layers.Dense(10, activation='softmax', use_bias=True)(pooled)

# Build the full model from input to output.
func_model = tf.keras.Model(resnet.input, output)

Train

func_model.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=[tf.keras.metrics.CategoricalAccuracy()],
    optimizer=tf.keras.optimizers.Adam())
# train for 5 epochs
func_model.fit(x_train, y_train, batch_size=128, epochs=5, verbose=2)
Epoch 1/5
469/469 - 56s - loss: 0.1184 - categorical_accuracy: 0.9690
Epoch 2/5
469/469 - 21s - loss: 0.0648 - categorical_accuracy: 0.9844
Epoch 3/5
469/469 - 21s - loss: 0.0503 - categorical_accuracy: 0.9867
Epoch 4/5
469/469 - 21s - loss: 0.0416 - categorical_accuracy: 0.9888
Epoch 5/5
469/469 - 21s - loss: 0.1556 - categorical_accuracy: 0.9697