
I'm new to Keras and wondering how to train an LSTM on (interrupted) time series of different lengths. Consider, for example, one continuous series from day 1 to day 10 and another continuous series from day 15 to day 20. Simply concatenating them into a single series might yield wrong results. I see basically two options to bring them to the shape (batch_size, timesteps, features):

  1. Pad the shorter series with some default value (0), i.e. for the above example we would have the following batch (a rough sketch of this padding approach follows the list):

    d1, ..., d10
    d15, ..., d20, 0, 0, 0, 0
    
  2. Compute the GCD of the lengths, cut the series into pieces, and use a stateful LSTM, resetting the state between independent series, i.e.:

    d1, ..., d5
    d6, ..., d10
    reset_state
    d15, ..., d20
    

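For option 1, something like the following is what I have in mind (the feature values are dummies; the Masking layer is there so that the LSTM skips the padded zeros rather than treating them as real data):

import numpy as np
from keras import models, layers

n_feats = 1
series = [np.random.rand(10, n_feats),  # stands in for d1, ..., d10
          np.random.rand(6, n_feats)]   # stands in for d15, ..., d20

# pad the shorter series with zeros at the end so both have length 10
maxlen = max(len(s) for s in series)
batch = np.zeros((len(series), maxlen, n_feats))
for i, s in enumerate(series):
    batch[i, :len(s)] = s

inputs = layers.Input(shape=(maxlen, n_feats))
masked = layers.Masking(mask_value=0.0)(inputs)  # ignore the padded timesteps
outputs = layers.LSTM(8)(masked)
model = models.Model(inputs, outputs)
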
Are there any other / better solutions? Is training a stateless LSTM with a complete sequence equivalent to training a stateful LSTM with pieces?

1 Answer

Have you tried feeding the LSTM layer with inputs of different lengths? The input time series can have different lengths when an LSTM is used (even the batch size can differ from one batch to the next, but obviously the number of features must be the same). Here is an example in Keras:

from keras import models, layers

n_feats = 32
latent_dim = 64

# the timesteps axis is None, so sequences of any length are accepted
lstm_input = layers.Input(shape=(None, n_feats))
lstm_output = layers.LSTM(latent_dim)(lstm_input)

model = models.Model(lstm_input, lstm_output)
model.summary()

Output:

Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, 32)          0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 64)                24832     
=================================================================
Total params: 24,832
Trainable params: 24,832
Non-trainable params: 0

As you can see, the first and second axes of the Input layer's output shape are None, meaning they are not pre-specified and can take any value. You can think of the LSTM as a loop: no matter how long the input is, as long as the incoming data vectors all have the same length (i.e. n_feats), the LSTM layer keeps processing them one timestep at a time. Therefore, as you can see above, the number of parameters in an LSTM layer depends neither on the batch size nor on the time-series length; it depends only on the length of the input feature vectors and the latent dimension of the LSTM.
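
To see this, you can reproduce the 24,832 figure from the summary above: a Keras LSTM layer has four sets of weights (for the input, forget and output gates plus the cell candidate), each with an input kernel, a recurrent kernel and a bias, so the parameter count is 4 * (latent_dim * (n_feats + latent_dim) + latent_dim), which involves only n_feats and latent_dim:

n_feats, latent_dim = 32, 64
print(4 * (latent_dim * (n_feats + latent_dim) + latent_dim))  # 24832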

import numpy as np

# feed LSTM with: batch_size=10, timesteps=5
model.predict(np.random.rand(10, 5, n_feats))   # This works

# feed LSTM with: batch_size=5, timesteps=100
model.predict(np.random.rand(5, 100, n_feats))  # This also works
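
The same holds for training, with one caveat: each individual batch must be a rectangular array, so all series within one batch need the same number of timesteps, while the length can vary from batch to batch. A minimal sketch of training on mixed lengths without padding, using train_on_batch with a made-up regression head and random targets (continuing from the code above):

inp = layers.Input(shape=(None, n_feats))
hidden = layers.LSTM(latent_dim)(inp)
pred = layers.Dense(1)(hidden)                    # hypothetical regression output
train_model = models.Model(inp, pred)
train_model.compile(optimizer='adam', loss='mse')

# one batch per sequence length; the timesteps axis differs between calls
for length in (10, 6):
    x_batch = np.random.rand(4, length, n_feats)  # dummy batch of 4 series
    y_batch = np.random.rand(4, 1)                # dummy targets
    train_model.train_on_batch(x_batch, y_batch)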

However, depending on the specific problem you are working on, this may not be suitable (though I don't have a specific example in mind right now where it would fail); in that case, you should make sure all the time series have the same length, for example by padding or truncating them.