Keras LSTM different X timesteps to Y timesteps (e.g. learn on last 4 predict next 2)

Question

I'm having some trouble getting an LSTM model in Keras to train on the last 4 timesteps and then just predict the next 2 timesteps.

I'm sure it's possible, but I think I'm just misusing part of the Keras API.

Here is a Google Colab workbook that generates some fake data, reshapes the X and Y to be passed into the model and then trains the model.

If I set X_N_TIMESTEPS to be the same as Y_N_TIMESTEPS it trains fine - so for example use the last 4 timesteps to predict the next 4.

But I'm trying to be a bit more general and be able to train on, say, the last 4 timesteps and then predict the next 2. The make_xy() function reshapes the data the way I think it needs to be, e.g.:

X.shape=(1995, 4, 3)
Y.shape=(1995, 2, 3)
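
For reference, the same windowing can be written more compactly. This is only a sketch, not the make_xy() used in this question: the name make_xy_windows and the synthetic data array are made up for illustration, and it assumes data is a 2-D array of shape (rows, features) with at least x_n_timesteps + y_n_timesteps rows.

```python
import numpy as np

def make_xy_windows(data, x_n_timesteps, y_n_timesteps):
    """Hypothetical helper: build (X, Y) sliding windows from a 2-D array."""
    # number of start rows for which both the x and the y window fit
    n_windows = len(data) - x_n_timesteps - y_n_timesteps + 1
    # x window i covers rows [i, i + x_n_timesteps)
    X = np.stack([data[i:i + x_n_timesteps] for i in range(n_windows)])
    # y window i covers the y_n_timesteps rows right after x window i
    Y = np.stack([
        data[i + x_n_timesteps:i + x_n_timesteps + y_n_timesteps]
        for i in range(n_windows)
    ])
    return X, Y

# synthetic stand-in with 2000 rows and 3 features
data = np.arange(2000 * 3, dtype=float).reshape(2000, 3)
X, Y = make_xy_windows(data, 4, 2)
print(X.shape, Y.shape)  # (1995, 4, 3) (1995, 2, 3)
```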

I think what I'm missing is telling the last Dense() layer I want it to output just 2 timesteps. The error I get is:

ValueError: Error when checking target: expected dense_1 to have shape (4, 3) but got array with shape (2, 3)

This suggests the last Dense layer does not know I want only 2 timesteps, even though that's what I'm passing in as the Y values.
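
One way to see where the (4, 3) comes from: with return_sequences=True the second LSTM emits one 10-dimensional vector per input timestep, and Dense acts only on the last axis, so the time axis keeps its length of 4. Below is a NumPy sketch of that shape arithmetic. The random arrays are placeholders standing in for the LSTM output and the Dense weights, not values from the model, and the batch size of 5 is arbitrary.

```python
import numpy as np

batch, timesteps, units, n_features = 5, 4, 10, 3

# placeholder for the output of LSTM(10, return_sequences=True)
lstm_out = np.random.rand(batch, timesteps, units)

# placeholder Dense(3) weights: kernel (10, 3) and bias (3,)
kernel = np.random.rand(units, n_features)
bias = np.zeros(n_features)

# Dense on a 3-D input multiplies along the last axis only,
# so the timestep axis passes through unchanged
dense_out = lstm_out @ kernel + bias
print(dense_out.shape)  # (5, 4, 3)
```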

I found this, which indicated that maybe I could pass an output_dim to the last Dense layer, but when I try that I get an error saying I need to use the Keras 2 API. Looking at the docs for Dense, I think the API must have changed a bit since then.

Here is all the code (in case it's preferred over the Colab link):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense

# %matplotlib inline

# define some variables
N_FEATURES = 3
X_N_TIMESTEPS = 4
Y_N_TIMESTEPS = 2
N_DATA_ORIG = 3000
N_ROLLING = 1000
N_DATA = N_DATA_ORIG - N_ROLLING

# make some noisy but smooth looking data
data = np.sqrt(np.random.rand(N_DATA_ORIG,N_FEATURES))
df_data = pd.DataFrame(data)
df_data = df_data.rolling(window=N_ROLLING).mean()
df_data = df_data.dropna()
df_data = df_data.head(N_DATA)
print(df_data.shape)
data = df_data.values
print(data.shape)
print(df_data.head())

# plot the normal healthy data
fig, ax = plt.subplots(num=None, figsize=(14, 6), dpi=80, facecolor='w', edgecolor='k')
size = len(data)
for x in range(data.shape[1]):
    ax.plot(range(0,size), data[:,x], '-', linewidth=1)

def make_xy(data,x_n_timesteps,y_n_timesteps,print_info=True):
    ''' Function to reshape the data into model ready format, either for training or prediction.
    '''
    # get original data shape
    data_shape = data.shape
    # get n_features from shape of input data
    n_features = data_shape[1]
    # loop through each row of data and reshape accordingly
    for i in range(len(data)):
        # row to start on for y (x starts on row i)
        yi = i + x_n_timesteps
        x = np.array([data[i:(i+x_n_timesteps)]])
        y = np.array([data[yi:(yi+y_n_timesteps)]])
        # only collect valid shapes
        if x.shape == (1,x_n_timesteps,n_features) and y.shape == (1,y_n_timesteps,n_features):
            # if initial data then copy else concatenate
            if i == 0:
                X = x
                Y = y
            else:
                X = np.concatenate((X,x))
                Y = np.concatenate((Y,y))
    if print_info:
        print('X.shape={}'.format(X.shape))
        print('Y.shape={}'.format(Y.shape))
    return X, Y

# build network 
model = Sequential()
model.add(LSTM(10,input_shape=(X_N_TIMESTEPS,N_FEATURES),return_sequences=True))
model.add(LSTM(10,return_sequences=True))
model.add(Dense(N_FEATURES))
model.compile(loss='mae', optimizer='adam')

# print model summary
model.summary()

# reshape data for training
print('... reshaping data for training ...')
X, Y = make_xy(data,X_N_TIMESTEPS,Y_N_TIMESTEPS)

# fit model
model.fit(X, Y)

keineahnung2345 · Accepted Answer · 2019-02-18T15:16:50

Your model outputs 4 timesteps, but you only want the last 2. You can add a Lambda layer to select them from the original output:

from keras.layers import Lambda
model = Sequential()
model.add(LSTM(10,input_shape=(X_N_TIMESTEPS,N_FEATURES),return_sequences=True))
model.add(LSTM(10,return_sequences=True))
model.add(Dense(N_FEATURES))
model.add(Lambda(lambda x: x[:,-2:,:]))
model.compile(loss='mae', optimizer='adam')
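
The Lambda layer applies ordinary slicing along the time axis (axis 1). Below is a NumPy sketch of the same indexing. The random array is only a placeholder for the Dense output, the batch size of 5 is arbitrary, and the slice is written with Y_N_TIMESTEPS rather than a hard-coded 2 as one possible generalisation.

```python
import numpy as np

X_N_TIMESTEPS, Y_N_TIMESTEPS, N_FEATURES = 4, 2, 3
batch = 5  # arbitrary batch size for illustration

# placeholder for the Dense output, shape (batch, 4, 3)
dense_out = np.random.rand(batch, X_N_TIMESTEPS, N_FEATURES)

# same indexing as the Lambda layer: keep the last Y_N_TIMESTEPS steps
sliced = dense_out[:, -Y_N_TIMESTEPS:, :]
print(sliced.shape)  # (5, 2, 3)
```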

New model structure:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_9 (LSTM)                (None, 4, 10)             560       
_________________________________________________________________
lstm_10 (LSTM)               (None, 4, 10)             840       
_________________________________________________________________
dense_5 (Dense)              (None, 4, 3)              33        
_________________________________________________________________
lambda_3 (Lambda)            (None, 2, 3)              0         
=================================================================
Total params: 1,433
Trainable params: 1,433
Non-trainable params: 0
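
A different approach, not the one used above, is to let the last LSTM return only its final state and have a Dense layer produce all Y_N_TIMESTEPS * N_FEATURES values at once, then reshape them. This is a sketch under the assumption that the same Keras imports and constants as in the question are available. It swaps the Lambda slice for a Reshape layer and has not been compared against the Lambda version for accuracy.

```python
from keras.models import Sequential
from keras.layers import LSTM, Dense, Reshape

# constants as defined in the question
N_FEATURES = 3
X_N_TIMESTEPS = 4
Y_N_TIMESTEPS = 2

model = Sequential()
model.add(LSTM(10, input_shape=(X_N_TIMESTEPS, N_FEATURES), return_sequences=True))
# return_sequences defaults to False: output is (None, 10), one vector per sample
model.add(LSTM(10))
# one output unit per (timestep, feature) pair to predict
model.add(Dense(Y_N_TIMESTEPS * N_FEATURES))
# fold the flat vector back into (timesteps, features)
model.add(Reshape((Y_N_TIMESTEPS, N_FEATURES)))
model.compile(loss='mae', optimizer='adam')

print(model.output_shape)  # (None, 2, 3)
```

Here the output length is decoupled from the input length, so Y_N_TIMESTEPS does not have to be smaller than X_N_TIMESTEPS.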
