I am training a Convolutional Neural Network on a dataset of face images. The dataset has 10,000 images of dimensions 700 x 700. My model has 12 layers. I am using a generator function to feed images into the Keras fit_generator function, as shown below.
train_file_names ==> Python list containing filenames of training instances
train_class_labels ==> Numpy array of one-hot encoded class labels ([0, 1, 0], [0, 0, 1] etc.)
train_data ==> Numpy array of training instances
train_steps_per_epoch ==> 16 (Batch size is 400 and I have 6400 training instances, so a single pass through the whole dataset takes 16 iterations)
batch_size ==> 400
calls_made ==> Counts the batches yielded in the current epoch. When the generator reaches the end of the training instances, it resets the indexes so that the next epoch loads data from the first index.
I am passing this generator as an argument to the Keras 'fit_generator' function to generate a new batch of data at each step of every epoch.
val_data, val_class_labels ==> Validation data numpy arrays
epochs ==> No. of epochs
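The step count listed above follows directly from the dataset size and the batch size. A minimal sketch of that arithmetic, where the helper name `steps_per_epoch` is hypothetical and the ceiling is an assumption so that a partial final batch would still count as a step:

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed for one full pass over the data."""
    # Ceiling division: a final, smaller batch still needs its own step.
    return math.ceil(num_samples / batch_size)


# Values from the question: 6400 training instances, batch size 400.
train_steps = steps_per_epoch(6400, 400)
```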
Using Keras fit_generator:
model.fit_generator(generator=train_generator, steps_per_epoch=train_steps_per_epoch, epochs=epochs, use_multiprocessing=False, validation_data=[val_data, val_class_labels], verbose=True, callbacks=[history, model_checkpoint], shuffle=True, initial_epoch=0)
Code
    def train_data_generator(self):
        index_start = index_end = 0
        temp = 0
        calls_made = 0
        while temp < train_steps_per_epoch:
            index_end = index_start + batch_size
            for temp1 in range(index_start, index_end):
                index = 0
                # Read image
                img = cv2.imread(str(TRAIN_DIR / train_file_names[temp1]), cv2.IMREAD_GRAYSCALE).T
                train_data[index] = cv2.resize(img, (self.ROWS, self.COLS), interpolation=cv2.INTER_CUBIC)
                index += 1
            yield train_data, self.train_class_labels[index_start:index_end]
            calls_made += 1
            if calls_made == train_steps_per_epoch:
                index_start = 0
                temp = 0
                calls_made = 0
            else:
                index_start = index_end
                temp += 1
            gc.collect()
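Reading the generator as posted, `index = 0` is executed inside the `for` loop, so `index` is reset before every image and each image is written to `train_data[0]`; the remaining slots of the batch are never refreshed. That would be consistent with the flat loss, though it may not be the only cause. Below is a minimal sketch of a batching loop that gives every image its own slot. The names `batch_generator`, `load_image`, and `to_array` are hypothetical and are not part of Keras or of the original code; `load_image` stands in for the `cv2.imread` / `cv2.resize` calls, and `to_array` for whatever builds the batch array (with NumPy this would probably be `numpy.stack`).

```python
def batch_generator(file_names, labels, batch_size, load_image, to_array=list):
    """Yield (inputs, targets) batches forever, restarting after each pass.

    load_image(name) is assumed to return one preprocessed image.
    to_array(images) is assumed to turn a list of images into the batch
    container expected by the model (for example numpy.stack).
    """
    n = len(file_names)
    while True:  # the training loop decides how many batches to pull
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)
            # Build a fresh list per batch: image i of the batch lands in
            # slot i, instead of every image overwriting slot 0.
            images = [load_image(name) for name in file_names[start:end]]
            # Inputs and targets are sliced with the same bounds, so they
            # always have the same length.
            yield to_array(images), labels[start:end]
```

Because the outer loop never ends, there is no need for separate counters to detect the end of an epoch; the data simply wraps around to the first file after the last batch.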
Output of fit_generator
Epoch 86/300
16/16 [==============================] - 16s 1s/step - loss: 1.5739 - acc: 0.2991 - val_loss: 12.0076 - val_acc: 0.2110
Epoch 87/300
16/16 [==============================] - 16s 1s/step - loss: 1.6010 - acc: 0.2549 - val_loss: 11.6689 - val_acc: 0.2016
Epoch 88/300
16/16 [==============================] - 16s 1s/step - loss: 1.5750 - acc: 0.2391 - val_loss: 10.2663 - val_acc: 0.2004
Epoch 89/300
16/16 [==============================] - 16s 1s/step - loss: 1.5526 - acc: 0.2641 - val_loss: 11.8809 - val_acc: 0.2249
Epoch 90/300
16/16 [==============================] - 16s 1s/step - loss: 1.5867 - acc: 0.2602 - val_loss: 12.0392 - val_acc: 0.2010
Epoch 91/300
16/16 [==============================] - 16s 1s/step - loss: 1.5524 - acc: 0.2609 - val_loss: 12.0254 - val_acc: 0.2027
My problem is that when I use 'fit_generator' with the generator function above, the model loss does not improve at all and validation accuracy is very poor. But when I use the Keras 'fit' function as below, the model loss decreases and validation accuracy is far better.
Using Keras fit function without using a generator
model.fit(self.train_data, self.train_class_labels, batch_size=self.batch_size, epochs=self.epochs, validation_data=[self.val_data, self.val_class_labels], verbose=True, callbacks=[history, model_checkpoint])
Output when trained using fit function
Epoch 25/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0207 - acc: 0.9939 - val_loss: 4.1009 - val_acc: 0.4916
Epoch 26/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0197 - acc: 0.9948 - val_loss: 2.4758 - val_acc: 0.5568
Epoch 27/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0689 - acc: 0.9800 - val_loss: 1.2843 - val_acc: 0.7361
Epoch 28/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0207 - acc: 0.9947 - val_loss: 5.6979 - val_acc: 0.4560
Epoch 29/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0353 - acc: 0.9908 - val_loss: 1.0801 - val_acc: 0.7817
Epoch 30/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0362 - acc: 0.9896 - val_loss: 3.7851 - val_acc: 0.5173
Epoch 31/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0481 - acc: 0.9896 - val_loss: 1.1152 - val_acc: 0.7795
Epoch 32/300
6400/6400 [==============================] - 20s 3ms/step - loss: 0.0106 - acc: 0.9969 - val_loss: 1.4803 - val_acc: 0.7372
img = img / 255.0
to make sure it learns. Otherwise the numbers are too big for anything to happen with the default learning rate. – sachinruk
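The scaling suggested in the comment above can be sketched as a small helper. This is only an illustration: `scale_pixels` is a hypothetical name, and the sketch assumes 8-bit grayscale input such as `cv2.imread(..., cv2.IMREAD_GRAYSCALE)` returns. If the training images are scaled this way, the validation arrays would need the same scaling.

```python
def scale_pixels(img):
    """Map 8-bit intensities from [0, 255] into [0.0, 1.0].

    Uses only division, so it applies element-wise to a NumPy array
    (such as the result of cv2.resize) and also works on a plain number.
    """
    return img / 255.0


# Hypothetical use inside the generator loop:
#   img = cv2.resize(img, (rows, cols), interpolation=cv2.INTER_CUBIC)
#   batch[i] = scale_pixels(img)
# Caveat: the array receiving the result must have a float dtype.
# Writing scaled values into a uint8 array truncates almost all of them to 0.
```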