Fit generator and data augmentation in keras

Question

I have a test dataset of 5 samples and a training dataset of 2000 samples. I would like to augment both datasets, and I am following the example provided by Keras:

import keras
from keras.preprocessing.image import ImageDataGenerator

datagen_test = ImageDataGenerator(
                featurewise_center=True,
                featurewise_std_normalization=True,
                rotation_range=20,
                width_shift_range=0.2,
                height_shift_range=0.2,
                horizontal_flip=True
                )
datagen_train = ImageDataGenerator(
                featurewise_center=True,
                featurewise_std_normalization=True,
                rotation_range=20,
                width_shift_range=0.2,
                height_shift_range=0.2,
                horizontal_flip=True
                )
datagen_train.fit(x_train)
validation_generator = datagen_test.flow(x_test, y_test, batch_size=5)


model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer='rmsprop',
              metrics=['accuracy'])
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen_train.flow(x_train, y_train, batch_size=50),
                    steps_per_epoch=len(x_train) / 10, epochs=epochs,
                    validation_data=validation_generator, validation_steps=800)

What I believe is that the steps_per_epoch parameter is the number of batches passed to the classifier. I set the batch_size in my generator to be 50, however I have only 5 samples. I think my question has nothing to do with samples_per_epoch, which is the number of samples processed in one epoch.

My question is: will the generator transform my images to create 50 different samples and pass them to the classifier, or will it transform only 5?

Possible duplicate of Augementations in Keras ImageDataGenerator — Wilmar van Ommeren
Check this: When the data set size is not a multiple of the mini-batch size, should the last mini-batch be smaller, or contain samples from other batches? — Matin
I don't believe setting your batch size to 50 will cause the generator to create 50 images by transforming your original 5. Aside from this, it may be worth checking out the video below on data augmentation in Keras, as I see the numbers you're using in steps_per_epoch and validation_steps don't appear to match the convention. youtu.be/1WVbqNbWCjk — blackHoleDetector
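As a rough sketch of the convention the last comment seems to refer to: if each epoch is meant to pass over the data once, the step count is usually the number of samples divided by the batch size, rounded up. The helper name steps_for_one_pass is hypothetical; the sample counts and batch sizes are the ones from the question.

```python
import math


def steps_for_one_pass(n_samples, batch_size):
    """Number of batches needed to see every sample once (hypothetical helper)."""
    return math.ceil(n_samples / batch_size)


# 2000 training samples in batches of 50, 5 test samples in batches of 5.
train_steps = steps_for_one_pass(2000, 50)
val_steps = steps_for_one_pass(5, 5)
print(train_steps, val_steps)  # 40 1
```

By comparison, steps_per_epoch=len(x_train) / 10 with batch_size=50 draws 200 batches per epoch, which is about five passes over the 2000 training samples, and validation_steps=800 draws 800 batches from the same 5 test samples.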

Marcin Możejko · Accepted Answer · 2017-10-17T18:17:24

Unfortunately, setting batch_size to 50 when you have only 5 examples will make your generator return only 5 examples in each batch (despite the batch_size). So it will not extend your batch to 50.
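To illustrate why the batch cannot grow past the dataset size, here is a minimal pure-Python sketch of a batch iterator that slices a shuffled list of indices. It is a simplified model under my own assumptions, not Keras's actual implementation, and iter_batches is a hypothetical name.

```python
import random


def iter_batches(n_samples, batch_size, shuffle=True, seed=0):
    """Yield lists of sample indices, one list per batch, looping forever.

    Hypothetical helper: a simplified model of a batch iterator,
    not the code Keras itself runs.
    """
    rng = random.Random(seed)
    while True:
        indices = list(range(n_samples))
        if shuffle:
            rng.shuffle(indices)
        # Slicing past the end of a list returns only what is left,
        # so a batch can never hold more than n_samples indices.
        for start in range(0, n_samples, batch_size):
            yield indices[start:start + batch_size]


# 5 samples with a requested batch size of 50, as in the question.
batches = iter_batches(n_samples=5, batch_size=50)
first_batch_sizes = [len(next(batches)) for _ in range(3)]
print(first_batch_sizes)  # [5, 5, 5]
```

In this model every step still sees all 5 indices, so with random augmentation each batch would hold newly transformed versions of the same 5 images. Getting more distinct augmented samples per epoch is then a matter of drawing more batches, not of raising batch_size.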
