0
votes

I have a CNN for activity recognition using 3 sensors. I stacked the dimensions of the sensors, giving me 9 channels, and divided the time series data into windows of 200 samples. I fed this to 2 CNN layers, 1 fully connected layer, and 1 softmax layer, all in TensorFlow.
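
Here is roughly what I have now, as a minimal sketch (TF 1.x; the filter counts, kernel sizes, and the 6 classes are just placeholders, not my exact values):

import tensorflow as tf

n_window, n_channels, n_classes = 200, 9, 6  # 6 classes is a placeholder

x = tf.placeholder(tf.float32, [None, n_window, n_channels])
y = tf.placeholder(tf.float32, [None, n_classes])

conv1 = tf.layers.conv1d(x, filters=32, kernel_size=5, activation=tf.nn.relu)
conv2 = tf.layers.conv1d(conv1, filters=64, kernel_size=5, activation=tf.nn.relu)

flat = tf.layers.flatten(conv2)          # shape (-1, N), the 2D output I mention below
fc = tf.layers.dense(flat, 128, activation=tf.nn.relu)
logits = tf.layers.dense(fc, n_classes)  # softmax is applied inside the loss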

Now I want to replace the fully connected layer with LSTM layers, but I don't know how to implement this. If I have the flattened output from my last convolutional layer, how do I feed it into an LSTM layer? And how do I apply dropout?

I saw GitHub code for LSTM-based activity recognition, and its input is

x = tf.placeholder(tf.float32, [None, n_steps, n_input])  # [batch, time steps, features per step]
y = tf.placeholder(tf.float32, [None, n_classes])         # one-hot labels

But my flattened output from the last layer is only 2D, of shape (-1, N). n_steps is the number of temporal steps, right? Should I reshape my flattened output, and if so, how? I believe n_steps in the GitHub LSTM code refers to the number of samples per window. So should I segment the flattened output into 200 samples per window again?

EDIT: What I want to do is divide the time series data into slices or time windows, apply convolutional layers, flatten the result, and feed it to an LSTM layer. But I don't know how to implement this, especially once I already have the flattened output. How do I segment it and feed it to the recurrent layer?


1 Answer

2
votes

LSTMs are an architecture for sequential data. Applying convolutions along your time dimension makes you lose that time dimension, which makes the subsequent use of LSTMs less meaningful.

What I would personally do is replace the CNN layers with the LSTM ones, since both are used to aggregate evidence along the time dimension. In this case I think the answer to your question is clear: n_steps is the number of time steps in your data.
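
A minimal sketch of this, assuming TF 1.x, your 200-step windows of 9 channels, and an arbitrary hidden size of 64 (the DropoutWrapper also answers your dropout question):

import tensorflow as tf

n_steps, n_input, n_classes = 200, 9, 6  # n_steps = samples per window

x = tf.placeholder(tf.float32, [None, n_steps, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=64)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.5)

# outputs has shape (batch, n_steps, 64); classify from the last time step
outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)
logits = tf.layers.dense(outputs[:, -1, :], n_classes)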

If you still want to apply LSTMs on top of a convolution, then you should design some kind of higher-level sequence. One possibility is to feed different windows to the convolutional layer and then use their outputs as the input sequence of the LSTM, as sketched below. Obviously this is only a "trick", and you should find good motivations for doing it.
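
One way to build that trick, assuming you cut each 200-sample window into 10 sub-windows of 20 steps (those numbers are arbitrary) and share one convolution across sub-windows by merging the batch and slice dimensions:

import tensorflow as tf

n_window, n_channels, n_classes = 200, 9, 6
n_slices, slice_len = 10, n_window // 10      # 10 slices of 20 steps each

x = tf.placeholder(tf.float32, [None, n_window, n_channels])

# Merge batch and slice dimensions so a single conv is shared across slices
slices = tf.reshape(x, [-1, slice_len, n_channels])           # (batch*10, 20, 9)
conv = tf.layers.conv1d(slices, filters=32, kernel_size=3,
                        activation=tf.nn.relu)
flat = tf.layers.flatten(conv)                                # (batch*10, N)

# Un-merge: each window is now a sequence of 10 flattened conv outputs
seq = tf.reshape(flat, [-1, n_slices, int(flat.shape[-1])])   # (batch, 10, N)

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=64)
outputs, _ = tf.nn.dynamic_rnn(cell, seq, dtype=tf.float32)
logits = tf.layers.dense(outputs[:, -1, :], n_classes)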