2 votes

I am implementing an encoder-decoder model using a bidirectional RNN for both the encoder and the decoder. Since I initialize the bidirectional RNN on the encoder side first, the weights and variables associated with it are already created, so I get the following error when I try to initialize another instance on the decoder side:

ValueError: Variable bidirectional_rnn/fw/gru_cell/w_ru already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope?

I tried defining each within its own name_scope, like below, but to no avail:

def enc(message, weights, biases):
    message = tf.unstack(message, timesteps_enc, 1)
    fw_cell = rnn.GRUBlockCell(num_hidden_enc)
    bw_cell = rnn.GRUBlockCell(num_hidden_enc)
    with tf.name_scope("encoder"):
        outputs, _, _ = rnn.static_bidirectional_rnn(fw_cell, bw_cell, message, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights) + biases


def dec(codeword, weights, biases):
    codeword = tf.expand_dims(codeword, axis=2)
    codeword = tf.unstack(codeword, timesteps_dec, 1)
    fw_cell = rnn.GRUBlockCell(num_hidden_dec)
    bw_cell = rnn.GRUBlockCell(num_hidden_dec)
    with tf.name_scope("decoder"):
        outputs, _, _ = rnn.static_bidirectional_rnn(fw_cell, bw_cell, codeword, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights) + biases

Can someone please hint at what I am doing wrong?

Just try to exchange name_scope for variable_scope. I'm not sure if it is still valid, but in older versions of TF the use of name_scope was discouraged. From your variable name bidirectional_rnn/fw/gru_cell/w_ru you can see that the scope is not applied. – carobnodrvo
Thanks for your response, that worked. Can you please post that as an answer so that I can mark it as correct? – learner

2 Answers

1 vote

Just putting it as an answer:

Just try to exchange name_scope for variable_scope. I'm not sure if it is still valid, but in older versions of TF the use of name_scope was discouraged. From your variable name bidirectional_rnn/fw/gru_cell/w_ru you can see that the scope is not applied.
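To see why the swap works, here is a toy, plain-Python sketch (not real TensorFlow code; `VariableStore` is an invented stand-in for TF1's variable registry). In TF1, tf.get_variable registers each variable under its full name; tf.name_scope does not prefix those names, while tf.variable_scope does, so the encoder and decoder variables end up under distinct paths:

```python
class VariableStore:
    """Toy stand-in for TF1's global variable registry (illustration only)."""
    def __init__(self):
        self._vars = {}

    def get_variable(self, name, reuse=False):
        # Mimics tf.get_variable: error on a duplicate name unless reuse=True.
        if name in self._vars:
            if not reuse:
                raise ValueError(
                    "Variable %s already exists, disallowed. "
                    "Did you mean to set reuse=True?" % name)
            return self._vars[name]
        self._vars[name] = object()  # dummy "variable"
        return self._vars[name]

store = VariableStore()

# name_scope leaves variable names unchanged, so the second creation
# collides, just like in the question:
store.get_variable("bidirectional_rnn/fw/gru_cell/w_ru")
try:
    store.get_variable("bidirectional_rnn/fw/gru_cell/w_ru")
except ValueError as e:
    print("collision:", e)

# variable_scope effectively prefixes the names, so "encoder/..." and
# "decoder/..." coexist without conflict:
store.get_variable("encoder/bidirectional_rnn/fw/gru_cell/w_ru")
store.get_variable("decoder/bidirectional_rnn/fw/gru_cell/w_ru")
print("no collision with distinct scope prefixes")
```

In real code this just means wrapping each rnn.static_bidirectional_rnn call in `with tf.variable_scope("encoder"):` and `with tf.variable_scope("decoder"):` instead of tf.name_scope.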

1 vote

One thing is that you cannot create two variables with the same name in the same scope, so changing name_scope to variable_scope will fix the training.

The other thing is that such a model cannot work as an encoder-decoder model, because the decoder RNN cannot be bidirectional. You do have the entire target sequence at training time, but at inference time you generate the target left to right. This means the forward RNN always has its left context, but the right context the backward RNN would need does not exist yet.
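The inference-time constraint can be sketched with a minimal greedy decoding loop (plain Python, not the asker's model; `step_fn` and `dummy_step` are made-up placeholders). At every step, the only input available is the prefix generated so far, i.e. the left context:

```python
def greedy_decode(step_fn, start_token, eos_token, max_len):
    """step_fn(prefix) -> next token. The prefix (left context) is all
    the decoder ever has at inference time; there is nothing to the
    right of the current position for a backward RNN to read."""
    prefix = [start_token]
    for _ in range(max_len):
        nxt = step_fn(prefix)  # uses only tokens generated so far
        prefix.append(nxt)
        if nxt == eos_token:
            break
    return prefix

# Dummy "model": emits increasing token ids, then EOS (99) after 3.
def dummy_step(prefix):
    return prefix[-1] + 1 if prefix[-1] < 3 else 99

print(greedy_decode(dummy_step, start_token=0, eos_token=99, max_len=10))
# -> [0, 1, 2, 3, 99]
```

A bidirectional decoder would need the full output sequence before producing it, which is exactly what this loop shows is unavailable; hence the decoder should be a unidirectional (forward) RNN.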