I would like to implement an LSTM for multivariate input in PyTorch.
Following this article https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/ which uses Keras, the input data are of shape (number of samples, number of timesteps, number of parallel features):
from numpy import array

in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])
. . .
Input      Output
[[10 15]
 [20 25]
 [30 35]]  65
[[20 25]
 [30 35]
 [40 45]]  85
[[30 35]
 [40 45]
 [50 55]]  105
[[40 45]
 [50 55]
 [60 65]]  125
[[50 55]
 [60 65]
 [70 75]]  145
[[60 65]
 [70 75]
 [80 85]]  165
[[70 75]
 [80 85]
 [90 95]]  185
n_timesteps = 3
n_features = 2
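For context, the article builds these samples with a windowing helper along the lines of the following sketch (paraphrased from the article, not verbatim; it reuses in_seq1, in_seq2, out_seq and n_timesteps from above):

from numpy import array, hstack

# stack the parallel series column-wise, with the target as the last column
dataset = hstack((in_seq1.reshape(-1, 1),
                  in_seq2.reshape(-1, 1),
                  out_seq.reshape(-1, 1)))   # shape (9, 3)

def split_sequences(sequences, n_steps):
    X, y = [], []
    for i in range(len(sequences)):
        end_ix = i + n_steps
        if end_ix > len(sequences):
            break
        # inputs: all feature columns in the window; target: last column at window end
        X.append(sequences[i:end_ix, :-1])
        y.append(sequences[end_ix - 1, -1])
    return array(X), array(y)

X, y = split_sequences(dataset, n_timesteps)
# X.shape == (7, 3, 2) == (samples, timesteps, features); y.shape == (7,)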
In Keras it seems to be easy:
model.add(LSTM(50, activation='relu', input_shape=(n_timesteps, n_features)))
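(For reference, the article's complete Keras model is roughly the following:)

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')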
Can it be done in another way than creating n_features separate LSTMs as the first layer, feeding each one its own sequence (imagine multiple parallel streams of sequences), and then flattening their outputs into a linear layer?
I'm not 100% sure, but by the nature of an LSTM the input cannot be flattened and passed as a 1D array, because each sequence "plays by different rules", which the LSTM is supposed to learn.
So how does such a Keras implementation map to the PyTorch input of shape (seq_len, batch, input_size)?
(source: https://pytorch.org/docs/stable/nn.html#lstm)
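To make the shape question concrete, here is my understanding as a sketch (assuming X is the (7, 3, 2) array built by the windowing above; I may be wrong):

import torch

X_t = torch.tensor(X, dtype=torch.float32)  # (samples, timesteps, features) == (7, 3, 2)

# nn.LSTM expects (seq_len, batch, input_size) by default, so the Keras
# layout seemingly has to be permuted before being fed to the LSTM:
X_t = X_t.permute(1, 0, 2)  # -> (3, 7, 2) == (seq_len, batch, input_size)

(Alternatively, constructing the LSTM with batch_first=True makes it accept (batch, seq_len, input_size), which matches the Keras layout directly.)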
Edit:
Regarding my question above, whether it can be done in another way than creating n_features separate LSTMs and feeding each its own stream: according to the PyTorch docs, the input_size parameter actually means the number of features (assuming that means the number of parallel sequences), so a single LSTM layer should handle the multivariate input by itself.
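If that reading is right, a minimal sketch of what the model could look like (the class name MV_LSTM and the hidden size of 50 are my own choices, mirroring the Keras example above, not something from the docs):

import torch
import torch.nn as nn

class MV_LSTM(nn.Module):
    def __init__(self, n_features, n_hidden=50):
        super().__init__()
        # input_size = number of parallel features, NOT the sequence length
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=n_hidden,
                            batch_first=True)  # accepts (batch, seq_len, features)
        self.linear = nn.Linear(n_hidden, 1)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        out, (h_n, c_n) = self.lstm(x)
        # predict from the hidden state at the last timestep
        return self.linear(out[:, -1, :])

model = MV_LSTM(n_features=2)
y_hat = model(torch.tensor(X, dtype=torch.float32))  # shape (7, 1)

With batch_first=True the module takes the same (samples, timesteps, features) layout as Keras, avoiding the permute shown earlier.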