I'm trying to train a CNN on a set of images using a DataGenerator class, and the model works perfectly fine normally. The problem is that the training dataset is heavily skewed toward a few classes, so I want to add class_weights. However, every time I do this I get an IndexError in the part of the code that converts my class labels into one-hot arrays.
This is for Keras running on top of TensorFlow. The function that raises the error is keras.utils.to_categorical().
Here's the code that calls to_categorical:
for i, pdb_id in enumerate(list_enzymes_temp):
    mat = precomputed_distance_matrix(pdb_id, self.dim)
    X[i,] = mat.distance_matrix.reshape(*self.dim)
    y[i] = int(self.labels[pdb_id.upper()][1]) - 1
return X, keras.utils.to_categorical(y, num_classes=self.n_classes)
Here's the function I am using to generate the weights:
def get_class_weights(dictionary, training_enzymes, mode):
    'Gets class weights for Keras'
    # Initialization
    counter = [0 for i in range(6)]
    # Count classes
    for enzyme in training_enzymes:
        counter[int(dictionary[enzyme.upper()][1])-1] += 1
    majority = max(counter)
    # Make dictionary
    class_weights = {i: float(majority/count) for i, count in enumerate(counter)}
    # Value according to mode
    if mode == 'unbalanced':
        for key in class_weights:
            class_weights[key] = 1
    elif mode == 'balanced':
        pass
    elif mode == 'mean_1_balanced':
        for key in class_weights:
            class_weights[key] = (1+class_weights[key])/2
    return class_weights
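To make the weighting concrete, here is a minimal standalone sketch of the 'balanced' and 'mean_1_balanced' arithmetic. The helper name balanced_class_weights and the sample counts are hypothetical, and it takes zero-based class indices directly rather than looking them up in the labels dictionary. Unlike the function above, which would raise ZeroDivisionError for a class with no training samples, this sketch assigns such a class a weight of 0.0, which is an assumption and not necessarily the right choice.

```python
from collections import Counter


def balanced_class_weights(class_indices, n_classes=6):
    """Weight each class by majority_count / class_count ('balanced' mode)."""
    counts = Counter(class_indices)
    majority = max(counts.values())
    # Assumption: a class with no samples gets weight 0.0 instead of
    # triggering a division by zero.
    return {
        i: (majority / counts[i]) if counts[i] else 0.0
        for i in range(n_classes)
    }


# Hypothetical skewed training set: class 0 dominates.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2 + [4] + [5]

weights = balanced_class_weights(indices)
# 'mean_1_balanced' averages each weight with 1 to soften the correction.
mean_1_weights = {k: (1 + w) / 2 for k, w in weights.items()}

print(weights)
print(mean_1_weights)
```

The keys are the zero-based class indices 0 to 5, which is the form the class_weight argument expects.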
And my fit_generator call:
model.fit_generator(generator=training_generator,
                    validation_data=validation_generator,
                    epochs=max_epochs,
                    max_queue_size=16,
                    class_weight=class_weights,
                    callbacks=[tensorboard])
Here's the IndexError message. The error does not appear, and the model works perfectly, when class_weights is not passed:
  File "C:\Users\Python\DMCNN\data_generator.py", line 73, in __getitem__
    X, y = self.__data_generation(list_enzymes_temp)
  File "C:\Users\Python\DMCNN\data_generator.py", line 59, in __data_generation
    return X, keras.utils.to_categorical(y, num_classes=self.n_classes)
  File "C:\Users\Python\Anaconda3\lib\site-packages\keras\utils\np_utils.py", line 34, in to_categorical
    categorical[np.arange(n), y] = 1
IndexError: index 1065353216 is out of bounds for axis 1 with size 6
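One detail in the traceback may help narrow this down: 1065353216 is 0x3f800000, the bit pattern of the float32 value 1.0, so the offending entry of y looks like leftover memory rather than a real label. The sketch below checks that and reproduces the same error with a NumPy-only stand-in for the indexing line shown in the traceback. The one_hot helper and the sample batch are hypothetical, and how y is allocated is not shown above, so a slot of y being left unassigned is only an assumption.

```python
import numpy as np

# The out-of-range index reported in the traceback.
bad_index = 1065353216

# Reinterpret the same 32 bits as a float32 value.
as_float = np.array([bad_index], dtype=np.int32).view(np.float32)[0]
print(hex(bad_index), as_float)


def one_hot(y, num_classes):
    """NumPy-only stand-in for the indexing step shown in the traceback."""
    y = np.asarray(y, dtype=int)
    n = y.shape[0]
    categorical = np.zeros((n, num_classes), dtype=np.float32)
    categorical[np.arange(n), y] = 1
    return categorical


# Hypothetical batch: two valid labels plus one slot holding stale memory.
labels = np.array([0, 3, bad_index])

try:
    one_hot(labels, 6)
    message = None
except IndexError as err:
    message = str(err)
    print(message)

# With only valid labels the conversion works as expected.
valid = one_hot(np.array([0, 3, 5]), 6)
```

If this is what is happening, checking y.min() and y.max() just before the to_categorical call should show whether any entry falls outside 0 to n_classes - 1.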