Goal
I am trying to train an LSTM autoencoder on a dataset of multivariate time series:
- X_train: (200, 23, 178)
- X_val: (100, 23, 178)
- X_test: (100, 23, 178)
Current situation
A plain autoencoder gets better results than a simple LSTM/GRU autoencoder architecture.
I have some doubts about how I use the RepeatVector layer which, as far as I understand, simply repeats the last state of the LSTM/GRU cell a number of times equal to the sequence length, so that its output matches the input shape the decoder layer expects.
The model does not raise any error, but the results are still an order of magnitude worse than those of the simple AE, while I was expecting them to be at least comparable, since a recurrent architecture should better fit the temporal problem.
First of all, are these results even comparable, given that the plain AE reconstructs single 178-feature vectors while the LSTM/GRU AE reconstructs whole (23, 178) sequences?
Either way, the reconstruction error of the LSTM-AE does not look good at all.
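For reference, here is a minimal sketch of how I understand RepeatVector to behave (tf.keras assumed; the shapes are illustrative):

```python
import numpy as np
from tensorflow.keras import layers

# RepeatVector tiles a (batch, features) tensor n times along a new
# time axis, producing (batch, n, features).
state = np.arange(6, dtype="float32").reshape(1, 6)  # (batch=1, features=6)
repeated = layers.RepeatVector(3)(state)             # -> (1, 3, 6)
print(repeated.shape)  # (1, 3, 6): the same vector repeated 3 times
```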
My AE model:
Layer (type)                               Output Shape      Param #
=====================================================================
dense (Dense)                              (None, 178)       31862
_____________________________________________________________________
batch_normalization (BatchNormalization)   (None, 178)       712
_____________________________________________________________________
dense_1 (Dense)                            (None, 59)        10561
_____________________________________________________________________
dense_2 (Dense)                            (None, 178)       10680
=====================================================================
- optimizer: sgd
- loss: mse
- activation function of the dense layers: relu
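In code, this looks roughly like the following (a sketch reconstructed from the summary above, assuming the tf.keras functional API; the parameter counts match the summary):

```python
from tensorflow.keras import layers, models

# Plain dense AE: each 178-feature vector is reconstructed on its own,
# i.e. the 23 timesteps are treated as independent samples.
inputs = layers.Input(shape=(178,))
x = layers.Dense(178, activation="relu")(inputs)         # 178*178 + 178 = 31862 params
x = layers.BatchNormalization()(x)                       # 4*178 = 712 params
encoded = layers.Dense(59, activation="relu")(x)         # 178*59 + 59 = 10561 params
decoded = layers.Dense(178, activation="relu")(encoded)  # 59*178 + 178 = 10680 params

ae = models.Model(inputs, decoded)
ae.compile(optimizer="sgd", loss="mse")
```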
My LSTM/GRU AE:
_____________________________________________________________________
Layer (type)                               Output Shape      Param #
=====================================================================
input_1 (InputLayer)                       (None, 23, 178)   0
_____________________________________________________________________
gru (GRU)                                  (None, 59)        42126
_____________________________________________________________________
repeat_vector (RepeatVector)               (None, 23, 59)    0
_____________________________________________________________________
gru_1 (GRU)                                (None, 23, 178)   127092
_____________________________________________________________________
time_distributed (TimeDistributed)         (None, 23, 178)   31862
=====================================================================
- optimizer: sgd
- loss: mse
- activation function of the GRU layers: relu
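And the recurrent version, again as a sketch reconstructed from the summary (setting reset_after=False makes the GRU parameter counts match; newer Keras versions default to reset_after=True):

```python
from tensorflow.keras import layers, models

# GRU AE over whole sequences of shape (23, 178).
inputs = layers.Input(shape=(23, 178))
# Encoder: compress each sequence into its final 59-dim hidden state.
encoded = layers.GRU(59, activation="relu", reset_after=False)(inputs)  # 42126 params
# Repeat that state 23 times so the decoder gets one copy per timestep.
repeated = layers.RepeatVector(23)(encoded)
# Decoder: expand back to 178 features at every timestep.
decoded = layers.GRU(178, activation="relu", return_sequences=True,
                     reset_after=False)(repeated)                       # 127092 params
# Per-timestep readout (no activation is listed in the summary, so
# the default linear activation is assumed here).
outputs = layers.TimeDistributed(layers.Dense(178))(decoded)            # 31862 params

lstm_ae = models.Model(inputs, outputs)
lstm_ae.compile(optimizer="sgd", loss="mse")
```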