I'm trying to train a model on almost 150 classes, using ImageDataGenerator to augment my dataset. I'm also using the ModelCheckpoint and CSVLogger callbacks to save the weights and log the training metrics. Training fails with an error partway through the first epoch. The images are grayscale, in case that matters.

Here is my code:

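from keras.preprocessing.image import ImageDataGenerator

batch_size = 2000
epochs = 10

# Random augmentation is applied on the fly as each batch is generated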
train_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.1, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=train_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')
valid_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.15, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=valid_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')
test_batches = ImageDataGenerator(preprocessing_function=preprocess_func, horizontal_flip=True, width_shift_range=0.15, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, fill_mode='nearest') \
    .flow_from_directory(directory=test_path, target_size=image_size, classes=dataset_classes, batch_size=5, color_mode='grayscale')

Here are my callbacks:

import os
import keras
from keras.callbacks import ModelCheckpoint, CSVLogger

checkpoint_path = "/content/drive/MyDrive/Colab Notebooks/Datasets/Experiment/weights_improvements-epoch:{epoch:02d}-val_accuracy:{val_accuracy:.2f}.hdf5"
checkpoint_dir = os.path.dirname(checkpoint_path)

# Create a callback that saves the model's weights
cp_callback = ModelCheckpoint(checkpoint_path,
                              verbose=1,
                              monitor='val_accuracy',
                              mode='max',
                              save_best_only=True,
                              save_weights_only=True)

log_folder = '/content/drive/MyDrive/Colab Notebooks/Datasets/Experiment'
log_path = os.path.join(log_folder, 'FSLR_logs.csv')
log_csv = CSVLogger(log_path, separator=',', append=False)

callback_list = [cp_callback, log_csv]

Fitting the model:

# Compile the model with its loss, optimizer, and metric
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

# Train the model with the new callback
history = model.fit(x=train_batches,
                    validation_data=valid_batches,
                    batch_size=batch_size,
                    epochs=epochs,
                    callbacks=callback_list)

The error I'm receiving is this:

Epoch 1/10
3428/4128 [=======================>......] - ETA: 26:10 - loss: 4.8299 - accuracy: 0.0078
---------------------------------------------------------------------------
UnknownError                              Traceback (most recent call last)
in ()
      4                     batch_size=batch_size,
      5                     epochs=epochs,
----> 6                     callbacks=callback_list)

6 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

UnknownError: OSError: image file is truncated (30 bytes not processed)
Traceback (most recent call last):

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/script_ops.py", line 249, in __call__
    ret = func(*args)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/autograph/impl/api.py", line 645, in wrapper
    return func(*args, **kwargs)

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/dataset_ops.py", line 892, in generator_py_func
    values = next(generator_state.get_iterator(iterator_id))

  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 822, in wrapped_generator
    for data in generator_fn():

  File "/usr/local/lib/python3.7/dist-packages/keras/engine/data_adapter.py", line 948, in generator_fn
    yield x[i]

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/iterator.py", line 65, in __getitem__
    return self._get_batches_of_transformed_samples(index_array)

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/iterator.py", line 230, in _get_batches_of_transformed_samples
    interpolation=self.interpolation)

  File "/usr/local/lib/python3.7/dist-packages/keras_preprocessing/image/utils.py", line 138, in load_img
    img = img.resize(width_height_tuple, resample)

  File "/usr/local/lib/python3.7/dist-packages/PIL/Image.py", line 1886, in resize
    self.load()

  File "/usr/local/lib/python3.7/dist-packages/PIL/ImageFile.py", line 247, in load
    "(%d bytes not processed)" % len(b)

OSError: image file is truncated (30 bytes not processed)

	 [[{{node PyFunc}}]]
	 [[IteratorGetNext]] [Op:__inference_train_function_1029]

Function call stack:
train_function

I tried the same code with only two classes and it trained fine. I don't know why it fails when I use all of my 140+ classes.
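
The last frames of the traceback are inside PIL's ImageFile.load, which suggests that at least one file in the full dataset cannot be decoded completely, and that the two-class subset may simply not contain it. A minimal sketch of a scan that could confirm this is below. It assumes the dataset is a directory tree of image files like the one passed to flow_from_directory. The names find_unreadable_images and pil_full_decode are hypothetical helpers written for this illustration, not part of Keras or Pillow; the only Pillow calls used are Image.open and Image.load.

```python
import os


def pil_full_decode(path):
    """Open a file with Pillow and force a full decode of its pixel data."""
    # Imported lazily so the directory walk below can be reused with another decoder.
    from PIL import Image

    with Image.open(path) as img:
        # Image.open only reads the header; load() decodes everything,
        # which is the step that fails on a truncated file.
        img.load()


def find_unreadable_images(root_dir, decode=pil_full_decode):
    """Return (path, error message) pairs for files under root_dir that fail to decode.

    Hypothetical helper: `decode` is any callable that raises on a bad file.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                decode(path)
            except (OSError, SyntaxError) as exc:
                # Pillow reports truncated or unidentifiable files as OSError;
                # some broken PNG files surface as SyntaxError instead.
                bad.append((path, str(exc)))
    return bad
```

Something like `for path, err in find_unreadable_images(train_path): print(path, err)` would then list the suspect files in the training folder, and the same call could be repeated for valid_path and test_path. Any file it reports would need to be re-exported or removed before training.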

Can someone explain what the problem is? I need this for my school project. Thank you in advance!