2 votes

I am working through the Keras IMDB example, and the data shape is as follows:

x_train shape: (25000, 80)

I simply changed the original code of the Keras example to this:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

max_features = 20000  # vocabulary size used by the stock IMDB example

model = Sequential()
layer1 = Embedding(max_features, 128)
layer2 = LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)  # was False in the original
layer3 = Dense(1, activation='sigmoid')
model.add(layer1)
model.add(layer2)
model.add(layer3)

The original model sets return_sequences to False; I changed it to True and now get this error:

expected dense_1 to have 3 dimensions, but got array with shape (25000, 1)

But when I print the model's structure, the output of the LSTM layer is indeed a 3D tensor:

lstm_1 (LSTM): (None, None, 128)

Comments:

When you set return_sequences to True, you have a many-to-many relationship: each word in the sentence gets an output value, versus just one output value when it is set to False. That is why the last layer's output needs a third dimension. - DJK

Yes, and the model summary shows that the output of the LSTM layer is (None, None, 128), but when it comes to fitting it becomes (25000, 1), which is quite odd. - BridgeMia

You can use a Keras Reshape layer. I asked a similar question a while back, and the answer is exactly what you're looking for. - DJK

Actually, I used a Flatten layer to solve the problem. Besides a Reshape layer, a TimeDistributed Dense layer after the LSTM also works, but the output of that layer is still an 80-dimensional vector, so you still need a Flatten layer to connect it to the last Dense layer. - BridgeMia

You wouldn't use a Flatten layer before the last Dense; that is handled by changing the Boolean value of return_sequences. - DJK
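For reference, a minimal sketch of the Flatten-based fix BridgeMia describes above. Hedged assumptions: max_features = 20000 and maxlen = 80 are the stock IMDB example values, and Embedding needs input_length set so that Flatten knows the sequence length at build time.

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Flatten

max_features = 20000  # vocabulary size from the stock example (assumption)
maxlen = 80           # padded review length, matching x_train shape (25000, 80)

model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))  # fixed length so Flatten can infer shape
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(Flatten())                       # (None, 80, 128) -> (None, 80 * 128)
model.add(Dense(1, activation='sigmoid'))  # back to one prediction per review
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The final Dense output is (None, 1) again, so labels of shape (25000, 1) fit without error.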

2 Answers

0 votes

You need to reshape your training array; use the code below:

import numpy as np

x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))

Do the same for your test array:

x_test = np.reshape(x_test, (x_test.shape[0], 1, x_test.shape[1]))

FYI: np is the NumPy package.

Timesteps in LSTM models: https://machinelearningmastery.com/use-timesteps-lstm-networks-time-series-forecasting/

Timesteps: This is equivalent to the amount of time steps you run your recurrent neural network. If you want your network to have memory of 60 characters, this number should be 60.
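For illustration, here is what that reshape does to the array shape (a toy sketch with dummy data; the real x_train comes from the IMDB loader):

import numpy as np

x_train = np.zeros((25000, 80))  # dummy stand-in for the padded IMDB sequences
x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))
print(x_train.shape)  # (25000, 1, 80): 25000 samples, 1 timestep, 80 features per step

Note that this treats each review as a single timestep of 80 features, so it suits a model whose first layer is the LSTM itself rather than an Embedding layer.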

0 votes

I think you need a TimeDistributed wrapper around the final Dense layer when the LSTM has return_sequences=True:

from keras.layers import TimeDistributed

layer2 = LSTM(128, dropout=0.2,
              recurrent_dropout=0.2, return_sequences=True)
layer3 = TimeDistributed(Dense(1, activation='sigmoid'))
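One caveat worth noting: with return_sequences=True plus TimeDistributed, the model emits one prediction per timestep, so the targets must be 3D as well. A minimal sketch of how the pieces fit, assuming max_features = 20000 and maxlen = 80 from the stock example; repeating each review's label across all timesteps is an illustrative assumption, not part of this answer:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, TimeDistributed

max_features = 20000  # stock IMDB example vocabulary size (assumption)
maxlen = 80           # stock IMDB example sequence length (assumption)

model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam')

# Output shape is (None, None, 1), so y must be (samples, timesteps, 1),
# e.g. by repeating each review label across all 80 timesteps (illustrative only):
# y_train_seq = np.repeat(y_train.reshape(-1, 1, 1), maxlen, axis=1)  # -> (25000, 80, 1)
# model.fit(x_train, y_train_seq, batch_size=32, epochs=1)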