This is my first time asking a question here (which means I really need help), and sorry for my bad English. I want to build a CNN-LSTM model for video classification in Keras, but I have a problem building my y_train. I will describe it below. I have a video dataset (each video has 10 frames) and I converted the videos to images. First I split the dataset into X_train, X_test, y_train, and y_test (80% train, 20% test):
X_train, X_test = img_data[:trainco], img_data[trainco:]
y_train, y_test = y[:trainco], y[trainco:]
X_train shape : (2280, 64, 64, 1) -> I have 2280 images, 64x64 (height x width), 1 channel
y_train shape : (2280, 26) -> 26 classes
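(trainco is just the 80% cut-off index, something like this, kept as a multiple of 10 so a video never gets split between train and test:)

trainco = int(len(img_data) * 0.8)    # sketch: 80% cut-off, roughly 2280 of 2850 frames
trainco = trainco - (trainco % 10)    # keep whole 10-frame videos together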
Then I have to reshape them before feeding them into the CNN-LSTM. *Note: I do the same thing with X_test and y_test.
time_steps = 10   # because I have 10 frames per video
X_train = X_train.reshape(int(X_train.shape[0] / time_steps), time_steps, X_train.shape[1], X_train.shape[2], X_train.shape[3])
y_train = y_train.reshape(int(y_train.shape[0] / time_steps), time_steps, y_train.shape[1])
X_train shape : (228, 10, 64, 64, 1), y_train shape : (228, 10, 26)
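Just to be sure this reshape keeps the 10 frames of one video together (reshape fills the time_steps axis with consecutive rows), here is a small check with a dummy array where every frame just stores the index of its video:

import numpy as np

frames = np.repeat(np.arange(228), 10)                        # video index of every frame, shape (2280,)
dummy_X = frames.reshape(2280, 1, 1, 1) * np.ones((2280, 64, 64, 1))
X_seq = dummy_X.reshape(-1, 10, 64, 64, 1)                    # (228, 10, 64, 64, 1)
print(np.all(X_seq[5] == 5))                                  # True -> sample 5 holds only frames of video 5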
And this is my model:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LSTM, TimeDistributed
from keras.callbacks import ModelCheckpoint

model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3), strides=(2, 2), activation='relu', padding='same'), input_shape=X_train.shape[1:]))
model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))
model.add(TimeDistributed(Conv2D(32, (3, 3), padding='same', activation='relu')))
model.add(TimeDistributed(MaxPooling2D((2, 2), strides=(2, 2))))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(256, return_sequences=False, input_shape=(64, 64)))
model.add(Dense(128))
model.add(Dense(64))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=["accuracy"])
checkpoint = ModelCheckpoint(fname, monitor='acc', verbose=1, save_best_only=True, mode='max', save_weights_only=True)
hist = model.fit(X_train, y_train, batch_size=num_batch, nb_epoch=num_epoch, verbose=1, validation_data=(X_test, y_test), callbacks=[checkpoint])
But I got an error that says
ValueError: Error when checking target: expected dense_3 to have 2 dimensions, but got array with shape (228, 10, 26)
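I think I see why: the LSTM has return_sequences=False, so it outputs a single vector per video, and the last Dense layer (which Keras auto-named dense_3) produces one 26-way prediction per video, i.e. a 2D output:

print(model.output_shape)   # (None, 26) -> 2D (batch, num_classes), not (None, 10, 26)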
Since it expects the target to have only 2 dimensions, I changed the code to
y_train = y_train.reshape(int(y_train.shape[0] / time_steps), y_train.shape[1])
And I got an error again that says
ValueError: cannot reshape array of size 59280 into shape (228,26)
Of course that cannot work, because y_train still has 2280 * 26 = 59280 values, and a (228, 26) array only holds 228 * 26 = 5928. So I changed the code again to
y_train = y_train.reshape(y_train.shape[0], y_train.shape[1])
And I still got an error
ValueError: Input arrays should have the same number of samples as target arrays. Found 228 input samples and 2280 target samples.
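Which matches the shapes: X_train now has one sample per video, while this version of y_train still has one row per frame:

print(X_train.shape)   # (228, 10, 64, 64, 1) -> 228 samples (one per video)
print(y_train.shape)   # (2280, 26)           -> 2280 samples (one per frame)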
What should I do? I know the problem but I don't know how to solve it. Please help me.