I am trying to grasp what the TimeDistributed wrapper does in Keras.
I get that TimeDistributed "applies a layer to every temporal slice of an input."
But I ran some experiments and got results that I cannot understand.
In short, when connected to an LSTM layer, TimeDistributed and a plain Dense layer produce the same results.
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# First model: Dense wrapped in TimeDistributed
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
print(model.output_shape)
# Second model: plain Dense, no TimeDistributed wrapper
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(Dense(1))
print(model.output_shape)
For both models, I got an output shape of (None, 10, 1).
Can anyone explain the difference between TimeDistributed and a plain Dense layer after an RNN layer?
The difference is the Dense layer flattening the input and then reshaping it, hence connecting different time steps and having more parameters, and TimeDistributed keeping the time steps separated (hence having fewer parameters). In your case Dense should have had 500 parameters, TimeDistributed only 50. – gionni
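One way to check this kind of claim is to build both variants and compare their output shapes and parameter counts directly with model.output_shape and model.count_params() (or model.summary()). The sketch below is only illustrative, assuming a Keras 2.x install; the build_model helper name is mine, not from the question.

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

def build_model(wrap_in_timedistributed):
    # Same LSTM front end as in the question; only the final layer differs.
    model = Sequential()
    model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
    if wrap_in_timedistributed:
        model.add(TimeDistributed(Dense(1)))
    else:
        model.add(Dense(1))
    return model

for wrapped in (True, False):
    m = build_model(wrapped)
    label = "TimeDistributed(Dense)" if wrapped else "Dense"
    # Print the output shape and the total trainable + non-trainable parameter count.
    print(label, m.output_shape, m.count_params())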