How to train different LSTMs in the same TensorFlow session?

Question

I would like to train two different LSTMs so that they interact in a dialogue context (i.e. one RNN generates a sequence, which is used as context for the second RNN, which answers, and so on). However, I do not know how to train them separately in TensorFlow (I think I have not fully understood the logic behind tf graphs). When I execute my code, I get the following error:

Variable rnn/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?

The error happens when I create my second RNN. Do you know how to fix this?

My code is the following:

#User LSTM
no_units=100
_seq_user = tf.placeholder(tf.float32, [batch_size, max_length_user, user_inputShapeLen], name='seq')
_seq_length_user = tf.placeholder(tf.int32, [batch_size], name='seq_length')

cell = tf.contrib.rnn.BasicLSTMCell(
        no_units)

output_user, hidden_states_user = tf.nn.dynamic_rnn(
    cell,
    _seq_user,
    dtype=tf.float32,
    sequence_length=_seq_length_user
)
out2_user = tf.reshape(output_user, shape=[-1, no_units])
out2_user =  tf.layers.dense(out2_user, user_outputShapeLen)

out_final_user = tf.reshape(out2_user, shape=[-1, max_length_user, user_outputShapeLen])
y_user_ = tf.placeholder(tf.float32, [None, max_length_user, user_outputShapeLen])


softmax_user = tf.nn.softmax(out_final_user, dim=-1)  
loss_user = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_user, labels=y_user_))
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4)
minimize = optimizer.minimize(loss_user)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(epoch):
    print 'Epoch: ', i
    batch_X, batch_Y, batch_sizes = lstm.batching(user_train_X, user_train_Y, sizes_user_train)
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes):
        sess.run(minimize, {_seq_user:data_, _seq_length_user:size_, y_user_:target_})

#System LSTM
no_units_system=100
_seq_system = tf.placeholder(tf.float32, [batch_size, max_length_system, system_inputShapeLen], name='seq_')
_seq_length_system = tf.placeholder(tf.int32, [batch_size], name='seq_length_')

cell_system = tf.contrib.rnn.BasicLSTMCell(
        no_units_system)

output_system, hidden_states_system = tf.nn.dynamic_rnn(
    cell_system,
    _seq_system,
    dtype=tf.float32,
    sequence_length=_seq_length_system
)
out2_system = tf.reshape(output_system, shape=[-1, no_units_system])
out2_system =  tf.layers.dense(out2_system, system_outputShapeLen)

out_final_system = tf.reshape(out2_system, shape=[-1, max_length_system, system_outputShapeLen])
y_system_ = tf.placeholder(tf.float32, [None, max_length_system, system_outputShapeLen])

softmax_system = tf.nn.softmax(out_final_system, dim=-1)  
loss_system = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_system, labels=y_system_))
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4)
minimize = optimizer.minimize(loss_system)

for i in range(epoch):
    print 'Epoch: ', i
    batch_X, batch_Y, batch_sizes = lstm.batching(system_train_X, system_train_Y, sizes_system_train)
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes):
        sess.run(minimize, {_seq_system:data_, _seq_length_system:size_, y_system_:target_})

J-min · Accepted Answer · 2017-04-23T06:37:47

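Regarding the variable scope error: both networks create their variables under the same default names (rnn/basic_lstm_cell/...), so try building each graph inside its own variable scope.

with tf.variable_scope('User_LSTM'):
    # build the user LSTM graph here

with tf.variable_scope('System_LSTM'):
    # build the system LSTM graph here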
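To see why separate scopes help, here is a minimal pure-Python sketch of a name-keyed variable store. It is a toy model and not TensorFlow's implementation: ToyVariableStore and build_lstm are hypothetical names made up for this illustration, and the only thing they mimic is the "already exists" check on the full variable name.

```python
from contextlib import contextmanager


class ToyVariableStore(object):
    """Hypothetical stand-in for a graph's variable store (not TensorFlow code)."""

    def __init__(self):
        self.variables = {}
        self._scopes = []

    @contextmanager
    def variable_scope(self, name):
        # Push a name prefix for every variable created inside the block.
        self._scopes.append(name)
        try:
            yield
        finally:
            self._scopes.pop()

    def get_variable(self, name):
        # The full name is the enclosing scope names joined with '/'.
        full_name = "/".join(self._scopes + [name])
        if full_name in self.variables:
            raise ValueError(
                "Variable %s already exists, disallowed." % full_name)
        self.variables[full_name] = object()
        return full_name


def build_lstm(store):
    # Hypothetical stand-in for creating an LSTM cell and unrolling it:
    # it always asks for the same default name.
    with store.variable_scope("rnn"):
        with store.variable_scope("basic_lstm_cell"):
            return store.get_variable("weights")


# Without distinct scopes, the second network asks for an existing name.
store = ToyVariableStore()
build_lstm(store)
collided = False
message = None
try:
    build_lstm(store)
except ValueError as err:
    collided = True
    message = str(err)

# With distinct outer scopes, the two full names differ, so nothing collides.
scoped = ToyVariableStore()
with scoped.variable_scope("User_LSTM"):
    user_weights = build_lstm(scoped)
with scoped.variable_scope("System_LSTM"):
    system_weights = build_lstm(scoped)
```

In this toy model the two networks end up with User_LSTM/rnn/basic_lstm_cell/weights and System_LSTM/rnn/basic_lstm_cell/weights, which is the kind of prefixing that wrapping each network in tf.variable_scope is meant to give you.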

Also, avoid reusing the same name for different Python objects (e.g. optimizer). The second assignment overrides the first, which will confuse you when you use TensorBoard.

By the way, I would recommend training the model end to end rather than training the two networks separately. Try feeding the output tensor of the first LSTM into the second LSTM, with a single loss function and a single optimizer.
