
I am trying to build a sequence-to-sequence encoder-decoder model and need to apply a softmax to the last layer so I can use categorical cross-entropy.

I've tried setting the activation of the last LSTM layer to 'softmax', but that doesn't seem to do the trick. Adding another Dense layer and setting its activation to softmax doesn't help either. What is the correct way to apply a softmax when the last LSTM outputs a sequence?

from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

# batch_size, timesteps, input_dim and latent_dim are defined elsewhere
inputs = Input(batch_shape=(batch_size, timesteps, input_dim), name='hella')

# encoder: stacked LSTMs, the last one collapses the sequence to a single latent vector
encoded = LSTM(latent_dim, return_sequences=True, stateful=False)(inputs)
encoded = LSTM(latent_dim, return_sequences=True, stateful=False)(encoded)
encoded = LSTM(latent_dim, return_sequences=True, stateful=False)(encoded)
encoded = LSTM(latent_dim, return_sequences=False)(encoded)

# decoder: repeat the latent vector across timesteps and decode back to input_dim
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(input_dim, return_sequences=True)(decoded)
# do softmax here
sequence_autoencoder = Model(inputs, decoded)

sequence_autoencoder.compile(loss='categorical_crossentropy', optimizer='adam')

1 Answer


Figured it out:

As of Keras 2, you can simply add:

TimeDistributed(Dense(input_dim, activation='softmax'))

TimeDistributed applies the wrapped Dense layer to every time step of the output sequence independently. Documentation can be found here: https://keras.io/layers/wrappers/
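
For completeness, a minimal sketch of how this slots into a model like the one in the question (the hyperparameter values are placeholders chosen only so the snippet runs, not from the original post):

from keras.layers import Input, LSTM, RepeatVector, Dense, TimeDistributed
from keras.models import Model

# placeholder hyperparameters for illustration
batch_size, timesteps, input_dim, latent_dim = 32, 10, 50, 64

inputs = Input(batch_shape=(batch_size, timesteps, input_dim))
encoded = LSTM(latent_dim, return_sequences=True)(inputs)
encoded = LSTM(latent_dim, return_sequences=False)(encoded)

decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(latent_dim, return_sequences=True)(decoded)
# Dense is applied independently at each time step, giving a per-step
# probability distribution over input_dim classes
decoded = TimeDistributed(Dense(input_dim, activation='softmax'))(decoded)

sequence_autoencoder = Model(inputs, decoded)
sequence_autoencoder.compile(loss='categorical_crossentropy', optimizer='adam')

With the softmax applied per time step, each target should be a sequence of one-hot vectors of length input_dim so that categorical cross-entropy matches the output shape.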