2 votes

I am working through the Keras IMDB example, and the data shape is as follows:

x_train shape: (25000, 80)

I simply changed the original code of the Keras example to this:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

max_features = 20000  # vocabulary size used by the stock IMDB example

model = Sequential()
layer1 = Embedding(max_features, 128)
layer2 = LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True)  # was False in the original
layer3 = Dense(1, activation='sigmoid')
model.add(layer1)
model.add(layer2)
model.add(layer3)

The original model sets return_sequences to False; I changed it to True and now get this error:

expected dense_1 to have 3 dimensions, but got array with shape (25000, 1)

But when I print the model's structure, the output of the LSTM layer is indeed a 3D tensor:

lstm_1 (LSTM): (None, None, 128)

Comments:

When you set return_sequences to True, you have a many-to-many relationship: each word in the sentence gets an output value, versus just one output value when it is set to False. That is why the last layer's output needs a third dimension. - DJK

Yes, and the model summary shows that the output of the LSTM layer is (None, None, 128), but when it comes to fitting it becomes (25000, 1), which is quite odd. - BridgeMia

You can use a Keras Reshape layer. I asked a similar question a while back, and the answer is exactly what you're looking for. - DJK

Actually, I used a Flatten layer to solve the problem. Besides a Reshape layer, a TimeDistributed Dense layer after the LSTM also works, but the output of that layer is still an 80-dimensional vector, so you still need a Flatten layer to connect it to the last Dense layer. - BridgeMia

You wouldn't use a Flatten layer before the last Dense; that is handled by changing the Boolean value of return_sequences. - DJK
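For reference, a minimal sketch of the Flatten-based fix BridgeMia describes above. Hedged assumptions: max_features = 20000 and maxlen = 80 are the stock IMDB example values, and Embedding needs input_length set so that Flatten knows the sequence length at build time.

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Flatten

max_features = 20000  # vocabulary size from the stock example (assumption)
maxlen = 80           # padded review length, matching x_train shape (25000, 80)

model = Sequential()
model.add(Embedding(max_features, 128, input_length=maxlen))  # fixed length so Flatten can infer shape
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(Flatten())                       # (None, 80, 128) -> (None, 80 * 128)
model.add(Dense(1, activation='sigmoid'))  # back to one prediction per review
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The final Dense output is (None, 1) again, so labels of shape (25000, 1) fit without error.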

2 Answers

0 votes

You need to reshape your training array; use the code below:

import numpy as np

x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))

Do the same for your test array:

x_test = np.reshape(x_test, (x_test.shape[0], 1, x_test.shape[1]))

FYI: np is the NumPy package.

Timesteps in LSTM models: https://machinelearningmastery.com/use-timesteps-lstm-networks-time-series-forecasting/

Timesteps: This is equivalent to the amount of time steps you run your recurrent neural network. If you want your network to have memory of 60 characters, this number should be 60.
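For illustration, here is what that reshape does to the array shape (a toy sketch with dummy data; the real x_train comes from the IMDB loader):

import numpy as np

x_train = np.zeros((25000, 80))  # dummy stand-in for the padded IMDB sequences
x_train = np.reshape(x_train, (x_train.shape[0], 1, x_train.shape[1]))
print(x_train.shape)  # (25000, 1, 80): 25000 samples, 1 timestep, 80 features per step

Note that this treats each review as a single timestep of 80 features, so it suits a model whose first layer is the LSTM itself rather than an Embedding layer.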

0 votes

I think you need a TimeDistributed wrapper around the final Dense layer when the LSTM has return_sequences=True:

from keras.layers import TimeDistributed

layer2 = LSTM(128, dropout=0.2,
              recurrent_dropout=0.2, return_sequences=True)
layer3 = TimeDistributed(Dense(1, activation='sigmoid'))
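One caveat worth noting: with return_sequences=True plus TimeDistributed, the model emits one prediction per timestep, so the targets must be 3D as well. A minimal sketch of how the pieces fit, assuming max_features = 20000 and maxlen = 80 from the stock example; repeating each review's label across all timesteps is an illustrative assumption, not part of this answer:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, TimeDistributed

max_features = 20000  # stock IMDB example vocabulary size (assumption)
maxlen = 80           # stock IMDB example sequence length (assumption)

model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='adam')

# Output shape is (None, None, 1), so y must be (samples, timesteps, 1),
# e.g. by repeating each review label across all 80 timesteps (illustrative only):
# y_train_seq = np.repeat(y_train.reshape(-1, 1, 1), maxlen, axis=1)  # -> (25000, 80, 1)
# model.fit(x_train, y_train_seq, batch_size=32, epochs=1)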