How to use Conv1D and Bidirectional LSTM in keras to do multiclass classification of each timestep?

Question

I am trying to use a Conv1D and Bidirectional LSTM in keras (much like in this question) for signal processing, but doing a multiclass classification of each time step.

The problem is that even though the shapes used by Conv1D and LSTM are somewhat equivalent:

Conv1D: (batch, length, channels)
LSTM: (batch, timeSteps, features)

The output of the Conv1D is = (length - (kernel_size - 1)/strides), and therefore doesn't match the LSTM shape anymore, even without using MaxPooling1D and Dropout.

To be more specific, my training set X has n samples with 1000 time steps and one channel (n_samples, 1000, 1), and I used LabelEncoder and OneHotEncoder so y has n samples, 1000 time steps and 5 one hot encoded classes (n_samples, 1000, 5).

Since one class is much more prevalent than the others (is actually the absence of signal), I am using loss='sparse_categorical_crossentropy', sample_weight_mode="temporal" and sample_weight to give a higher weight to time steps containing meaningful classes.

model = Sequential()
model.add(Conv1D(128, 3, strides=1, input_shape = (1000, 1), activation = 'relu'))
model.add(Bidirectional(LSTM(128, return_sequences=True)))
model.add(TimeDistributed(Dense(5, activation='softmax')))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'], sample_weight_mode="temporal")
print(model.summary())

Model

When I try to fit the model I get this error message:

Error when checking target: expected time_distributed_1 to have shape (None, 998, 1) but got array with shape (100, 1000, 5).

Is there a way to make such a neural network configuration work?

Daniel Möller Daniel Möller · Accepted Answer · 2017-12-01T15:11:48

Your convolution is cutting the tips of the sequence. Use padding='same' in the convolutional layers.

The message, though, seems not to fit your model. Your model clearly has 5 output features (because of Dense(5)), but the massage says it expects 1. Maybe this is happening because of "sparse" crossentropy. You should probably, by the format of your data, use a "categorical_crossentropy".

How to use Conv1D and Bidirectional LSTM in keras to do multiclass classification of each timestep?

1 Answers