I'm trying to classify videos into 10 categories. I have this model so far:
from tensorflow import keras

model = keras.models.Sequential()
# Apply the same Conv2D to every frame of the video
model.add(keras.layers.TimeDistributed(
    keras.layers.Conv2D(filters=3, kernel_size=(5, 5), activation='relu'),
    input_shape=(None, 90, 90, 3)))
model.add(keras.layers.TimeDistributed(keras.layers.Flatten()))
# Aggregate the per-frame features over time
model.add(keras.layers.LSTM(20, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])
The problem is that each video has a different number of frames, where each frame is 90x90 with 3 channels. I understand that the first dimension of input_shape represents the number of frames per video, but my videos are not all the same length. How can I train this model on these videos?

I have the videos loaded in the following format:

[[image1ofvideo1, image2ofvideo1], [image1ofvideo2, image2ofvideo2, image3ofvideo2], [image1ofvideo3]]

If I try to train on this directly, I get an error, and NumPy cannot build an array from sequences of varying length. I would also like to avoid padding the videos with black frames to make them equal length.
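To illustrate, here is a minimal sketch of the structure described above, with dummy zero-filled frames standing in for my real images, and the point where it fails:

import numpy as np

# Stand-in for the real data: a list of videos, each a list of
# 90x90x3 frames, with a different number of frames per video.
videos = [
    [np.zeros((90, 90, 3), dtype=np.float32)] * 2,  # video 1: 2 frames
    [np.zeros((90, 90, 3), dtype=np.float32)] * 3,  # video 2: 3 frames
    [np.zeros((90, 90, 3), dtype=np.float32)] * 1,  # video 3: 1 frame
]

# NumPy cannot stack sequences of different lengths into one rectangular
# array: recent versions raise a ValueError here, and older ones fall
# back to a dtype=object array that model.fit cannot use either.
x = np.array(videos)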