0
votes

I am using the Fashion-MNIST dataset to try to work this out. I am downloading the data from the following links:

Training set images: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz

Training set labels: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz

Test set images: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz

Test set labels: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz

I use the following code to load the dataset:

import os
import gzip
import numpy as np


def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    labels_path = os.path.join(path,
                               '%s-labels-idx1-ubyte.gz'
                               % kind)
    images_path = os.path.join(path,
                               '%s-images-idx3-ubyte.gz'
                               % kind)

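    # Labels: skip the 8-byte IDX header, then read the raw uint8 label bytes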
    with gzip.open(labels_path, 'rb') as lbpath:
        labels = np.frombuffer(lbpath.read(), dtype=np.uint8,
                               offset=8)

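    # Images: skip the 16-byte IDX header, then reshape to one flat 784-pixel row per label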
    with gzip.open(images_path, 'rb') as imgpath:
        images = np.frombuffer(imgpath.read(), dtype=np.uint8,
                               offset=16).reshape(len(labels), 784)

    return images, labels

label = ['T-shirt/top',  'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt',
         'Sneaker', 'Bag', 'Ankle boot']

data_dir = './'
X_train, y_train = load_mnist(r'D:\book', kind='train')
X_test, y_test = load_mnist(r'D:\book', kind='t10k')

X_train = X_train.astype(np.float32) / 256.0
X_test = X_test.astype(np.float32) / 256.0

I am trying to build a Convolutional Neural Network with the following architecture:

  • Convolutional Layer with 32 filters with size of 3x3
  • ReLU activation function
  • 2x2 MaxPooling
  • Convolutional Layer with 64 filters with size of 3x3
  • ReLU activation function
  • 2x2 MaxPooling
  • Fully connected layer with 512 units and ReLU activation function
  • Softmax activation layer for the output layer

The model is trained for 100 epochs using the SGD optimizer.

My code is:

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

X_train = X_train.reshape([60000, 28, 28, 1])
X_train = X_train.astype('float32') / 255.0
X_test = X_test.reshape([10000, 28, 28, 1])
X_test = X_test.astype('float32') / 255.0
model = Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=[28,28,1]))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100)

But it is taking a very long time to run, around 30 minutes per epoch. I think I am doing something wrong in my code. Can someone help me figure out what?


1 Answer

2
votes

A few points I want to highlight:

  • Look at these lines in your code, right after loading the Fashion-MNIST dataset:

    X_train = X_train.astype(np.float32) / 256.0
    X_test = X_test.astype(np.float32) / 256.0 
    

    Why are you dividing by 256.0? Pixel values in an image range from 0 to 255, so you should divide by 255.0 to normalize them to the range 0-1.

  • After normalizing your data once at load time, you normalize it again. Check the code below:

    X_train = X_train.reshape([60000, 28, 28, 1])
    X_train = X_train.astype('float32') / 255.0
    X_test = X_test.reshape([10000, 28, 28, 1])
    X_test = X_test.astype('float32') / 255.0
    

    Here, after reshaping, you are normalizing again. There is no need for that: scaling the data a second time squashes the pixel values far below the intended 0-1 range, which can slow down convergence while training the network. Normalize exactly once.
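
    A minimal sketch of doing the scaling exactly once, right after loading (reusing the load_mnist helper and the paths from your question):

    X_train, y_train = load_mnist(r'D:\book', kind='train')
    X_test, y_test = load_mnist(r'D:\book', kind='t10k')

    # Reshape to (N, 28, 28, 1) and scale 0-255 pixels to 0-1 in one step
    X_train = X_train.reshape([60000, 28, 28, 1]).astype('float32') / 255.0
    X_test = X_test.reshape([10000, 28, 28, 1]).astype('float32') / 255.0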

  • You are not passing a batch_size value to the model.fit function. As per the Keras documentation,

    If unspecified, batch_size will default to 32.

    This might be the reason execution is taking so long. Try increasing batch_size to 100, 200, etc., and then compare the execution time.
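
    For example, a sketch with an illustrative batch_size of 200 (tune it to your memory budget; validation_data is optional):

    model.fit(X_train, y_train,
              epochs=100,
              batch_size=200,                     # larger batches mean fewer weight updates per epoch
              validation_data=(X_test, y_test))   # optional: report test accuracy after each epoch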

  • It might also be that you are training your model on the CPU instead of a GPU; 60000x28x28 training data is not a small dataset.
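
    A quick way to check (assuming TensorFlow 2.x) is to list the GPUs TensorFlow can see; an empty list means training runs on the CPU only:

    import tensorflow as tf

    # Prints the visible GPU devices; [] means CPU-only training
    print(tf.config.list_physical_devices('GPU'))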