I am attempting to run a crowd estimation model that classifies images into three broad categories depending on how many people are in them. 1200 images are used for training, with 20% of them used for validation. I used sentdex's tutorial on YouTube as a reference to load the image data into the model; I load the images as a zip file, extract it, and categorise the images based on the folders they are in.
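For context, here is a minimal, self-contained sketch of that loading step. The folder names ("low", "medium", "high") and the image size are hypothetical, and dummy .npy arrays stand in for the real images (which are read with cv2 in the actual code), so the sketch runs on its own:

```python
import os
import tempfile
import numpy as np

CATEGORIES = ["low", "medium", "high"]  # hypothetical folder names
IMG_SIZE = 50                           # hypothetical image size

# Stand-in dataset: one folder per category, dummy arrays instead of real images
data_dir = tempfile.mkdtemp()
for cat in CATEGORIES:
    os.makedirs(os.path.join(data_dir, cat))
    for i in range(4):
        arr = np.random.randint(0, 256, (IMG_SIZE, IMG_SIZE), dtype=np.uint8)
        np.save(os.path.join(data_dir, cat, f"img_{i}.npy"), arr)

# Label each image with the index of the folder it sits in
X, y = [], []
for class_idx, cat in enumerate(CATEGORIES):
    folder = os.path.join(data_dir, cat)
    for fname in os.listdir(folder):
        img = np.load(os.path.join(folder, fname))  # cv2.imread in the real code
        X.append(img)
        y.append(class_idx)

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)  # add the channel dimension
y = np.array(y)
print(X.shape, y.shape)  # (12, 50, 50, 1) (12,)
```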
My issue is that whenever I try to train the model, the loss and validation loss are always 0, so the model does not really learn and the validation accuracy stays the same throughout, as seen here. How can I get the loss to actually change? Is there something wrong with my implementation?
So far, what I have attempted is:
- I tried adding a third convolutional layer, with little effect.
- I also tried changing the last Dense layer to model.add(Dense(3)), but I got an error saying "Shapes (None, 1) and (None, 3) are incompatible".
- I tried a lower learning rate (0.001), but the model then returned 0 for validation accuracy.
- Changing the optimizer did not make any difference.
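Regarding the Dense(3) error: my understanding is that the two shapes in the message come from integer labels of shape (None, 1) being compared against a 3-way output of shape (None, 3). A quick numpy sketch of the two label formats (the labels here are hypothetical):

```python
import numpy as np

# Hypothetical integer labels for a 3-class problem, shaped (N, 1)
y_int = np.array([[0], [2], [1], [2]])

# One-hot encoding turns each label into a length-3 vector, shaped (N, 3),
# which is the shape a 3-unit output layer produces
y_onehot = np.eye(3)[y_int.reshape(-1)]

print(y_int.shape)     # (4, 1)
print(y_onehot.shape)  # (4, 3)
```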
Below is a snippet of my code so far showing my model attempt:
from datetime import datetime
import keras
import keras.backend as K
from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dropout, Flatten, Dense
logdir = "logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
X = X/255.0
model = Sequential()
model.add(Conv2D(64, (3,3), input_shape = X.shape[1:])) #[1:] to skip the -1
model.add(Activation("relu"))
model.add(Conv2D(64, (3,3), input_shape = X.shape[1:])) #[1:] to skip the -1
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(128, (3,3)))
model.add(Activation('relu'))
model.add(Conv2D(128, (3,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
#fully connected layer
model.add(Dense(1))
model.add(Activation('softmax'))
opt = keras.optimizers.Adam(lr=0.01)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=30, validation_data=(x_val, y_val), callbacks=[tensorboard_callback], shuffle=True)
The full code can be found on Colab here.