
I am trying to build an LSTM-VAE for time series reconstruction using Keras.

I referred to https://github.com/twairball/keras_lstm_vae/blob/master/lstm_vae/vae.py and https://machinelearningmastery.com/lstm-autoencoders/ when creating the LSTM-VAE architecture.

I am having trouble training the network; I get the following error while training in eager execution mode:

  InvalidArgumentError: Incompatible shapes: [8,1] vs. [32,1] [Op:Mul]

The input shape is (7752, 30, 1): 7752 samples, 30 time steps, and 1 feature.

Model Encoder:

# encoder
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Lambda, RepeatVector, Dense

latent_dim = 1
inter_dim = 32

# input: (samples, timesteps, features)
input_x = keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]))

#intermediate dimension 
h = keras.layers.LSTM(inter_dim)(input_x)

#z_layer
z_mean = keras.layers.Dense(latent_dim)(h)
z_log_sigma = keras.layers.Dense(latent_dim)(h)
z = Lambda(sampling)([z_mean, z_log_sigma])

Model Decoder:

# Reconstruction decoder
decoder1 = RepeatVector(X_train.shape[1])(z)
decoder1 = keras.layers.LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = keras.layers.TimeDistributed(Dense(1))(decoder1)

Sampling function:

batch_size = 32
def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + z_log_sigma * epsilon

VAE loss function:

def vae_loss2(input_x, decoder1):
    """ Calculate loss = reconstruction loss + KL loss for each data in minibatch """
    # E[log P(X|z)]
    recon = K.sum(K.binary_crossentropy(input_x, decoder1), axis=1)
    # D_KL(Q(z|X) || P(z|X)); calculate in closed form as both dist. are Gaussian
    kl = 0.5 * K.sum(K.exp(z_log_sigma) + K.square(z_mean) - 1. - z_log_sigma, axis=1)

    return recon + kl

[Image: LSTM-VAE model architecture]

Any suggestions to make the model work?

Comment: VAE LSTM for time-series: towardsdatascience.com/… – Marco Cerliani

1 Answer


You need to infer the batch dimension inside the sampling function: with batch_size hard-coded to 32, the last, smaller batch breaks (7752 % 32 = 8, which is exactly the [8,1] vs. [32,1] mismatch in your error). You also need to pay attention to your loss: it uses the outputs of intermediate layers (z_mean and z_log_sigma), which a loss function passed to compile cannot see, so I attach it to the model with model.add_loss(...).

# imports
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM, Dense, Lambda, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model

# encoder
latent_dim = 1
inter_dim = 32
timesteps, features = 100, 1

def sampling(args):
    z_mean, z_log_sigma = args
    batch_size = tf.shape(z_mean)[0]  # <=== infer the batch size dynamically
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + z_log_sigma * epsilon

# timesteps, features
input_x = Input(shape= (timesteps, features)) 

#intermediate dimension 
h = LSTM(inter_dim, activation='relu')(input_x)

#z_layer
z_mean = Dense(latent_dim)(h)
z_log_sigma = Dense(latent_dim)(h)
z = Lambda(sampling)([z_mean, z_log_sigma])

# Reconstruction decoder
decoder1 = RepeatVector(timesteps)(z)
decoder1 = LSTM(inter_dim, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(features))(decoder1)

def vae_loss2(input_x, decoder1, z_log_sigma, z_mean):
    """ Calculate loss = reconstruction loss + KL loss for each data in minibatch """
    # E[log P(X|z)]
    recon = K.sum(K.binary_crossentropy(input_x, decoder1))
    # D_KL(Q(z|X) || P(z|X)); calculate in closed form as both dist. are Gaussian
    kl = 0.5 * K.sum(K.exp(z_log_sigma) + K.square(z_mean) - 1. - z_log_sigma)

    return recon + kl

m = Model(input_x, decoder1)
m.add_loss(vae_loss2(input_x, decoder1, z_log_sigma, z_mean)) #<===========
m.compile(loss=None, optimizer='adam')
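
Note that compile is called with loss=None: the complete VAE loss is already attached to the graph via add_loss, so Keras does not expect an external target and the model can be fit on the inputs alone.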

Here is the running notebook.
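
For completeness, a minimal training-and-reconstruction sketch (my own addition, not from the original answer): it uses random stand-in data with the question's shape (7752, 30, 1), so timesteps must be set to 30 instead of the 100 used in the snippet above.

import numpy as np

# stand-in data with the question's shape: (samples, timesteps, features);
# the real X_train would be used here instead
X_train = np.random.rand(7752, 30, 1).astype('float32')

# rebuild the model above with timesteps, features = 30, 1 so the shapes match,
# then train; no targets are passed because the loss was attached via add_loss
m.fit(X_train, epochs=10, batch_size=32)

# reconstruction now works for any batch size, including the final partial
# batch of 8 samples, since the sampling layer infers the batch dim at run time
X_rec = m.predict(X_train)
print(X_rec.shape)  # (7752, 30, 1)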