
I'm currently taking my first steps with Keras on top of TensorFlow to classify time-series data. I was able to get a fairly simple model running, but after some feedback it was recommended that I stack multiple GRU layers and wrap my Dense layers in TimeDistributed. Here is the model I was trying:

from keras.models import Sequential
from keras.layers import Dense, GRU, TimeDistributed

model = Sequential()
model.add(GRU(100, input_shape=(n_timesteps, n_features), return_sequences=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(GRU(100, return_sequences=True, go_backwards=True, dropout=0.5))
model.add(TimeDistributed(Dense(units=100, activation='relu')))
model.add(TimeDistributed(Dense(n_outputs, activation='softmax')))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

I am receiving the following error message when trying to fit the model on input of shape (2357, 128, 11), i.e. 2357 samples, 128 timesteps, 11 features:

ValueError: Error when checking target: expected time_distributed_2 to have 3 dimensions, but got array with shape (2357, 5)

This is the output for model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
gru_1 (GRU)                  (None, 128, 100)          33600     
_________________________________________________________________
gru_2 (GRU)                  (None, 128, 100)          60300     
_________________________________________________________________
gru_3 (GRU)                  (None, 128, 100)          60300     
_________________________________________________________________
gru_4 (GRU)                  (None, 128, 100)          60300     
_________________________________________________________________
gru_5 (GRU)                  (None, 128, 100)          60300     
_________________________________________________________________
gru_6 (GRU)                  (None, 128, 100)          60300     
_________________________________________________________________
time_distributed_1 (TimeDist (None, 128, 100)          10100     
_________________________________________________________________
time_distributed_2 (TimeDist (None, 128, 5)            505       
=================================================================
Total params: 345,705
Trainable params: 345,705
Non-trainable params: 0

So what is the correct way to stack multiple GRU layers and apply the TimeDistributed wrapper to the subsequent Dense layers? I would be very grateful for any helpful input.


1 Answer


If you set return_sequences=False in the last GRU layer, the code will work. Note that the output is then 2-D, so the Dense layers should be applied directly, without the TimeDistributed wrapper.

You only need return_sequences=True when the output of one RNN layer is fed into another RNN layer, because the next layer needs the hidden state at every timestep, i.e. the time dimension must be preserved. When you set return_sequences=False, the layer returns only the last hidden state (instead of the hidden state at every timestep), and the time dimension disappears.
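A minimal sketch of this difference (assuming tensorflow.keras and the question's shapes of 128 timesteps and 11 features): the same GRU layer produces a 3-D output with return_sequences=True and a 2-D output with return_sequences=False.

```python
# Compare output ranks of a GRU with return_sequences True vs False.
from tensorflow.keras.layers import GRU, Input
from tensorflow.keras.models import Model

inp = Input(shape=(128, 11))                  # (timesteps, features)
seq = GRU(100, return_sequences=True)(inp)    # hidden state at every timestep
last = GRU(100, return_sequences=False)(inp)  # only the final hidden state

print(Model(inp, seq).output_shape)   # (None, 128, 100) -> 3-D, feed into next RNN
print(Model(inp, last).output_shape)  # (None, 100)      -> 2-D, feed into Dense
```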

That is why, when you set return_sequences=False, the number of output dimensions drops from 3 to 2: the timestep axis is removed, and the model's output then matches your target shape of (2357, 5).
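Putting it together, a corrected version of the question's model might look like the sketch below (assuming tensorflow.keras; shortened to two GRU layers for brevity, with the hypothetical constants taken from the question's shapes). The last GRU uses return_sequences=False, so the Dense head sees a 2-D tensor and TimeDistributed is no longer needed.

```python
# Sketch of the corrected architecture: stacked GRUs, with the last one
# collapsing the time axis so the output matches targets of shape (n, 5).
from tensorflow.keras.layers import Dense, GRU, Input
from tensorflow.keras.models import Sequential

n_timesteps, n_features, n_outputs = 128, 11, 5  # from the question

model = Sequential()
model.add(Input(shape=(n_timesteps, n_features)))
model.add(GRU(100, return_sequences=True, dropout=0.5))   # passes sequences on
model.add(GRU(100, return_sequences=False, dropout=0.5))  # drops the time axis
model.add(Dense(100, activation='relu'))                  # plain Dense, no TimeDistributed
model.add(Dense(n_outputs, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

print(model.output_shape)  # (None, 5) -- compatible with target shape (2357, 5)
```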