
I am trying to stack bidirectional LSTM cells in TensorFlow. This is what I have:

cells_fw = []
cells_bw = []
for layer in xrange(args.num_layers):
    cell_fw = tf.contrib.rnn.LSTMCell(args.hidden_size,
                                      initializer=tf.orthogonal_initializer())

    cell_bw = tf.contrib.rnn.LSTMCell(args.hidden_size,
                                      initializer=tf.orthogonal_initializer())

    cells_fw.append(cell_fw)
    cells_bw.append(cell_bw)

output = initial_input 
for layer in xrange(args.num_layers):
    ((output_fw, output_bw), (last_state_fw, first_state_bw)) = tf.nn.bidirectional_dynamic_rnn(
        cells_fw[layer], cells_bw[layer], output,
        dtype=tf.float32)

    output = tf.concat([output_fw, output_bw], axis=-1)

This gives me the following error:

ValueError: Variable bidirectional_rnn/fw/lstm_cell/kernel already exists, disallowed. Did you mean to set reuse=True in VarScope?

When I set reuse=True, I get:

ValueError: Variable bidirectional_rnn/fw/lstm_cell/kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

Can someone tell me what I am doing wrong, or what the right way to do this is?

1 Answer


Usually you simply need to create RNNs in different scopes, as suggested in this issue:

with tf.variable_scope('lstm1'):
  output, state = tf.nn.rnn_cell.BasicLSTMCell(3)(input, init_state)
with tf.variable_scope('lstm2'):
  output2, state2 = tf.nn.rnn_cell.BasicLSTMCell(3)(input2, init_state2)

Note that the scope should wrap the RNN call (which is where the variables are created), not the cell construction.
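Applied to the loop in your question, that means giving each stacked layer its own scope around the bidirectional_dynamic_rnn call. A minimal sketch, assuming the cells_fw/cells_bw lists and initial_input from your code (the scope name 'bidir_layer_*' is just illustrative):

output = initial_input
for layer in xrange(args.num_layers):
    # Each layer gets its own variable scope, so the fw/bw kernels of one
    # layer do not collide with those of the next layer.
    with tf.variable_scope('bidir_layer_{}'.format(layer)):
        (output_fw, output_bw), _ = tf.nn.bidirectional_dynamic_rnn(
            cells_fw[layer], cells_bw[layer], output,
            dtype=tf.float32)
    output = tf.concat([output_fw, output_bw], axis=-1)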

If you really need these RNNs to live in the same scope, open that scope with reuse=tf.AUTO_REUSE (introduced in recent versions of TensorFlow).
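For example, something along these lines (a sketch only; the scope name is arbitrary, and it assumes input and input2 have the same shape so the shared kernel fits both calls):

with tf.variable_scope('lstm', reuse=tf.AUTO_REUSE):
  output, state = tf.nn.rnn_cell.BasicLSTMCell(3)(input, init_state)
with tf.variable_scope('lstm', reuse=tf.AUTO_REUSE):
  # Reuses the variables created by the first call instead of raising
  # the "already exists" error.
  output2, state2 = tf.nn.rnn_cell.BasicLSTMCell(3)(input2, init_state2)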