I'm trying to classify videos into 10 categories. I have this model so far:
from tensorflow import keras

model = keras.models.Sequential()
# Apply the same Conv2D to every frame of the video
model.add(keras.layers.TimeDistributed(
    keras.layers.Conv2D(filters=3, kernel_size=(5, 5), activation='relu'),
    input_shape=(None, 90, 90, 3)))
model.add(keras.layers.TimeDistributed(keras.layers.Flatten()))
# Aggregate the per-frame features over time
model.add(keras.layers.LSTM(20, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
The problem is that each video has a different number of frames, where each frame is 90x90 with 3 channels. I understand that the first dimension of input_shape represents the number of frames per video, but my videos are not all the same length. How can I train this model on these videos?

I have the videos loaded in the following format:

[[image1ofvideo1, image2ofvideo1], [image1ofvideo2, image2ofvideo2, image3ofvideo2], [image1ofvideo3]]

If I try to train on this directly, I get an error, and NumPy cannot build an array from sequences of varying length. I would also like to avoid padding the videos with black frames to make them equal length.
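To illustrate, here is a minimal sketch of the structure described above, with dummy zero-filled frames standing in for my real images, and the point where it fails:

import numpy as np

# Stand-in for the real data: a list of videos, each a list of
# 90x90x3 frames, with a different number of frames per video.
videos = [
    [np.zeros((90, 90, 3), dtype=np.float32)] * 2,  # video 1: 2 frames
    [np.zeros((90, 90, 3), dtype=np.float32)] * 3,  # video 2: 3 frames
    [np.zeros((90, 90, 3), dtype=np.float32)] * 1,  # video 3: 1 frame
]

# NumPy cannot stack sequences of different lengths into one rectangular
# array: recent versions raise a ValueError here, and older ones fall
# back to a dtype=object array that model.fit cannot use either.
x = np.array(videos)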