
I am trying to implement a Seq2Seq variant in TensorFlow that has two encoders and one decoder. The first layer of each encoder is a stack of bidirectional LSTMs, so I wrote this method to build stacked bidirectional LSTMs with a variable number of layers:

import tensorflow as tf
from tensorflow.contrib.rnn import LSTMCell

def bidirectional_lstm(batch, num_layers=2, hidden_layer=256):
    # One forward and one backward cell per layer; each direction gets half
    # of the hidden units so the concatenated output size is hidden_layer.
    forward_lstms = [LSTMCell(num_units=hidden_layer // 2) for _ in range(num_layers)]
    backward_lstms = [LSTMCell(num_units=hidden_layer // 2) for _ in range(num_layers)]

    # Zero initial states for every layer, in both directions.
    states_fw = [f_l.zero_state(BATCH_SIZE, tf.float64) for f_l in forward_lstms]
    states_bw = [b_l.zero_state(BATCH_SIZE, tf.float64) for b_l in backward_lstms]

    outputs, final_state_fw, final_state_bw = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
        forward_lstms,
        backward_lstms,
        batch,
        initial_states_fw=states_fw,
        initial_states_bw=states_bw,
        parallel_iterations=32
    )

    return outputs

But when I run the lines below:

a=bidirectional_lstm(a_placeholder)

b=bidirectional_lstm(b_placeholder, num_layers=1)

I get this error message:

ValueError: Variable
stack_bidirectional_rnn/cell_0/bidirectional_rnn/fw/lstm_cell/kernel
already exists, disallowed. Did you mean to set reuse=True or
reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/rnn/python/ops/rnn.py", line 233, in stack_bidirectional_dynamic_rnn
    time_major=time_major)

I do not want to "reuse" a given stacked bidirectional LSTM. How can I run two separate encoders containing two stacked bidirectional LSTMs?


1 Answer


Figured it out: the two encoders need to be built inside two different variable scopes, so that each one gets its own set of LSTM weights. Otherwise they collide on the same variable names and would get "mixed up" during gradient updates:

with tf.variable_scope("a"):
    a=bidirectional_lstm(a_placeholder)


with tf.variable_scope("b"):
    b=bidirectional_lstm(b_placeholder)
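
As a quick sanity check that the scoping really gives each encoder its own weights, you can list the trainable variables after building the graph. This is a minimal TF 1.x sketch; BATCH_SIZE = 32 and the feature size of 128 in the placeholders are assumed values for illustration, and bidirectional_lstm is the function from the question:

import tensorflow as tf

BATCH_SIZE = 32                       # assumed value for illustration
a_placeholder = tf.placeholder(tf.float64, [BATCH_SIZE, None, 128])  # assumed feature size
b_placeholder = tf.placeholder(tf.float64, [BATCH_SIZE, None, 128])

with tf.variable_scope("a"):
    a = bidirectional_lstm(a_placeholder)

with tf.variable_scope("b"):
    b = bidirectional_lstm(b_placeholder, num_layers=1)

# Every kernel/bias now lives under either the "a/" or the "b/" prefix,
# so the second call no longer tries to re-create an existing variable.
for v in tf.trainable_variables():
    print(v.name)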