In Keras, I have seen that many people set return_sequences to False when they train a many-to-one LSTM model. I wonder whether I could instead apply a TimeDistributed layer to each timestep's output and then use a Dense layer on top to get the final output?
1 Answer
Yes, you can do that. The question is what you want to achieve. return_sequences=True returns the hidden state for every timestep; it is usually used to stack several LSTMs or for sequence-to-sequence (many-to-many) predictions. The default is False because using the full sequence output is not the standard use case.
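A minimal sketch contrasting the two settings. The layer sizes and the input shape (10 timesteps, 8 features) are placeholder assumptions, not anything from your model:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Standard many-to-one: return_sequences=False (the default) keeps only
# the hidden state of the last timestep, shape (batch, 32).
many_to_one = keras.Sequential([
    layers.LSTM(32, input_shape=(10, 8)),
    layers.Dense(1),
])

# Stacked LSTMs: the lower layer must return the full sequence,
# shape (batch, 10, 32), so the upper LSTM has a time axis to consume.
stacked = keras.Sequential([
    layers.LSTM(32, return_sequences=True, input_shape=(10, 8)),
    layers.LSTM(16),   # consumes the sequence, returns only its last state
    layers.Dense(1),
])
```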
Using the sequence output in a final Dense layer for a many-to-one prediction does not usually help much, since the LSTM should already learn to summarize the sequence in its last hidden state. Give it a try anyway: sometimes it helps, even if it is hard to explain why.
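For reference, here is roughly what the variant from your question could look like: keep all timestep outputs, apply a Dense layer per timestep via TimeDistributed, then collapse the time axis into a single prediction. Again, the sizes, the Flatten step, and the loss are just assumptions for the sketch:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.LSTM(32, return_sequences=True, input_shape=(10, 8)),  # (batch, 10, 32)
    layers.TimeDistributed(layers.Dense(16, activation="relu")),  # (batch, 10, 16)
    layers.Flatten(),                                             # (batch, 160)
    layers.Dense(1),                                              # single many-to-one output
])
model.compile(optimizer="adam", loss="mse")
```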