I am finding it hard to understand how flow_from_directory of ImageDataGenerator works. I am using the following code to augment image data for my CNN model, because very few training images are available.
batch_size = 16
from keras.preprocessing.image import ImageDataGenerator
train_transformed = 'dataset/train_transformed'
train_datagen = ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(
    'dataset/train',
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='binary',
    save_to_dir=train_transformed,
    save_prefix='train_aug',
    save_format='png')
It's a binary classification problem with 20 positive and 20 negative images, so the dataset/train folder has 2 subfolders with 20 images each. When I train the model with the above image generator, I can see 4160 images being saved in the dataset/train_transformed folder, and I presume these 4160 images are the ones used for training the model.
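For reference, this is roughly how the saved images can be counted. It is only a minimal sketch: the helper name count_saved_images is made up for illustration, and it assumes every saved file starts with the save_prefix and ends with the save_format extension used above.

```python
import os


def count_saved_images(directory, prefix="train_aug", extension=".png"):
    """Count files in `directory` whose names start with `prefix` and end with `extension`.

    Hypothetical helper for illustration; returns 0 if the directory does not exist.
    """
    if not os.path.isdir(directory):
        return 0
    return sum(
        1
        for name in os.listdir(directory)
        if name.startswith(prefix) and name.lower().endswith(extension)
    )
```

Calling count_saved_images('dataset/train_transformed') after training returns the number of augmented files currently on disk. If two saved files ever received the same name, the later one would overwrite the earlier one, so this count is at most the number of images the generator actually produced.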
model.fit_generator(
    train_generator,
    steps_per_epoch=1000 // batch_size,
    epochs=5,
    validation_data=validation_generator,
    validation_steps=100 // batch_size)
According to my understanding,
number of samples in each epoch = batch_size x steps_per_epoch
As my steps_per_epoch = 1000 // 16 = 62,
the number of samples in each epoch should be 62 x 16 = 992.
The number of epochs is set to 5, so the total number of generated images should be 992 x 5 = 4960, not the 4160 that I see on disk.
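To make the arithmetic concrete, here is a minimal sketch in plain Python (no Keras involved). The first part reproduces my expectation above. The second part is only a guess at what might be happening: it assumes the generator cycles through the 40 images in passes and emits a shorter final batch in each pass (16, 16, 8), which I have not confirmed. The function batch_sizes is a made-up helper for illustration.

```python
batch_size = 16
num_images = 40  # 20 positive + 20 negative
steps_per_epoch = 1000 // batch_size  # 62
epochs = 5

# My expectation: every step yields a full batch.
expected_per_epoch = batch_size * steps_per_epoch  # 992
expected_total = expected_per_epoch * epochs       # 4960


def batch_sizes(num_images, batch_size, total_steps):
    """Yield the size of each batch, ASSUMING the iterator walks the dataset
    in passes and the last batch of each pass holds only the leftover images."""
    remaining = num_images
    for _ in range(total_steps):
        size = min(batch_size, remaining)
        yield size
        remaining -= size
        if remaining == 0:
            remaining = num_images  # start the next pass over the dataset


# Total images produced over all steps under that assumption.
total_with_partial = sum(batch_sizes(num_images, batch_size, steps_per_epoch * epochs))

print(expected_per_epoch, expected_total, total_with_partial)  # 992 4960 4136
```

Under that assumption the total would be 4136, which is closer to the 4160 files I see but still does not match exactly, so something else must also be contributing.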
Also, the number of images generated varies from run to run, even with the same hyperparameters.
I would appreciate an explanation of this behaviour for the above configuration.