I was wondering how LSTMs work in Keras.
Let's take an example. I have a maximum sentence length of 3 words, e.g. 'how are you'. I vectorize each word into a vector of length 4, so one sentence has shape (3, 4). Now I want to use an LSTM to do translation (just as an example).
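For concreteness, this is roughly what that input looks like (random numbers are just placeholders standing in for the real word vectors):

import numpy as np

# Toy stand-in for the vectorized sentence 'how are you':
# 3 timesteps (words), each encoded as a length-4 vector.
sentence = np.random.random((3, 4))
print(sentence.shape)   # (3, 4)

# Keras models consume batches, so a single sample becomes (1, 3, 4).
batch = sentence[np.newaxis, ...]
print(batch.shape)      # (1, 3, 4)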
from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(1, input_shape=(3, 4), return_sequences=True))
model.summary()
According to Keras, I get an output shape of (3, 1):
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_16 (LSTM)               (None, 3, 1)              24
=================================================================
Total params: 24
Trainable params: 24
Non-trainable params: 0
_________________________________________________________________
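Feeding one dummy batch of random data through the model above confirms that shape:

# One dummy sample: 3 timesteps, 4 features each.
x = np.random.random((1, 3, 4))
out = model.predict(x)
print(out.shape)        # (1, 3, 1) -- one scalar per timestep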
And this is what I don't understand.
Each unit of an LSTM (with return_sequences=True, so that I get the output at every timestep) should give me a vector of shape (timesteps, x), where timesteps is 3 in this case and x is the size of my word vectors (4 here).

So why do I get an output shape of (3, 1)? I searched everywhere but can't figure it out.
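For comparison, what I expected would look more like this variant (model2 is just an illustrative name), where the last dimension matches my word-vector size:

model2 = Sequential()
# Same input shape, but 4 units instead of 1.
model2.add(LSTM(4, input_shape=(3, 4), return_sequences=True))
model2.summary()        # Output Shape: (None, 3, 4)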