3 votes

I am trying to create a multi-layer recurrent neural network with GRU units (as well as LSTM units) in TensorFlow. I have looked at multiple sources, including the official tutorial, but I keep seeing the following pattern for multi-layer RNNs (shown here with GRU units).

from tensorflow.contrib.rnn import GRUCell, DropoutWrapper, MultiRNNCell  # TF 1.x

cell = GRUCell(num_neurons)  # or LSTMCell(num_neurons)
cell = DropoutWrapper(cell, output_keep_prob=dropout)
cell = MultiRNNCell([cell] * num_layers)

This code snippet is taken from RNN Tensorflow Introduction. My understanding of [cell] * num_layers is that the same object cell gets referenced num_layers times in the list. So, won't MultiRNNCell create a deep network where each layer has the same weights as the previous ones? If someone can clarify what exactly is happening here, it would be very insightful.
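The list-aliasing part of the question is ordinary Python semantics and is easy to verify without TensorFlow; here FakeCell is just a hypothetical stand-in for GRUCell:

class FakeCell:
    """Hypothetical stand-in for GRUCell, used only to check object identity."""
    pass

cell = FakeCell()
cells = [cell] * 3

print(cells[0] is cells[1] is cells[2])  # True: one object, referenced three times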

1
Initially, yes. The weights are the same per layer of the GRU / LSTM units, but as the neural net learns, those weights will be updated. That's why, when you create a stacked RNN (GRU / LSTM), you have to call tf.nn.dynamic_rnn(cell, x, initial_state=Hin). From what I understand, Hin carries the states of the GRU / LSTM units, which means it also has the weights per layer. – afagarap
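For reference, a minimal sketch of the pattern the comment refers to, assuming the TF 1.x API of the question's era (batch_size, time_steps and the other sizes are made up for illustration). Note that Hin holds the per-layer hidden states that flow between time steps; the trainable weights themselves live in the graph's variables:

import tensorflow as tf  # assumes TensorFlow 1.x

batch_size, time_steps, input_dim = 32, 20, 10   # hypothetical sizes
num_neurons, num_layers = 64, 2

x = tf.placeholder(tf.float32, [batch_size, time_steps, input_dim])
cell = tf.nn.rnn_cell.GRUCell(num_neurons)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers)

# Hin: one zero-initialized state tensor per layer, carried across time steps.
Hin = cell.zero_state(batch_size, tf.float32)
outputs, Hout = tf.nn.dynamic_rnn(cell, x, initial_state=Hin)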

1 Answer

2 votes

I am assuming you already understand the concepts and execution model of TensorFlow well. If not, please check the tutorials on tensorflow.org, in particular the one on Variables.

The constructor of GRUCell doesn't add any nodes to the graph. Only when you call the instance (i.e., cell(...)) are variables and operations added to the graph. When the MultiRNNCell instance gets called, it creates a different variable scope for each sublayer before calling it. This way each layer has its own variables.
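A minimal sketch of how you could see this, assuming the TF 1.x behavior described above (the sizes are hypothetical). Before dynamic_rnn calls the cell there are no trainable variables; afterwards, each layer shows up under its own cell_0 / cell_1 scope, so the layers train separate weights even though the Python list held a single object:

import tensorflow as tf  # assumes TensorFlow 1.x

num_neurons, num_layers = 64, 2          # hypothetical sizes
x = tf.placeholder(tf.float32, [None, 20, 10])

cell = tf.nn.rnn_cell.GRUCell(num_neurons)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers)

print(len(tf.trainable_variables()))     # 0 -- the constructors added nothing

outputs, state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

for v in tf.trainable_variables():
    print(v.name)  # e.g. rnn/multi_rnn_cell/cell_0/... and .../cell_1/...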