I'm working on a prediction project using an LSTM model in TensorFlow. The implementation runs, but the results are poor: accuracy on the test set is only 0.5. So I searched for tricks for training LSTM-based models and found "adding dropout". However, after following a tutorial, I now get an error.
Here's the original version, which works:
def lstmModel(x, weights, biases):
    x = tf.unstack(x, time_step, 1)
    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
After changing it to the version below, I get this error:
ValueError: Shape (90, ?) must have rank at least 3
def lstmModel(x, weights, biases):
    x = tf.unstack(x, time_step, 1)
    lstm_cell = tf.nn.rnn_cell.LSTMCell(n_hidden, state_is_tuple=True, forget_bias=1)
    lstm_dropout = tf.nn.rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
    lstm_layers = rnn.MultiRNNCell([lstm_dropout] * 3)
    outputs, states = tf.nn.dynamic_rnn(lstm_layers, x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']
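My own guess (it may be wrong): tf.unstack turns the rank-3 batch into a list of rank-2 slices, which static_rnn accepts but dynamic_rnn apparently does not, hence the "rank at least 3" complaint. A quick shape check in plain numpy:

```python
import numpy as np

# (batch_size, time_step, data_size) batch, matching my input shape
batch = np.zeros((30, 4, 80))
# one timestep slice, i.e. what one element of the unstacked list looks like
step = batch[:, 0, :]

print(batch.ndim)  # 3 -> the rank dynamic_rnn seems to require
print(step.ndim)   # 2 -> the rank of each element static_rnn consumes
```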
I wonder whether the shape of my input data is wrong. Before entering this function, the input x has shape (batch_size, time_step, data_size), where
batch_size = 30
time_step = 4  # read 4 words
data_size = 80  # vocabulary of 80 words; each word is one-hot with shape [1, 80]
So each batch of x has shape [30, 4, 80], and the word x[0, 0, :] is followed by the word x[0, 1, :].
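To make the data layout concrete, here is roughly how I picture one batch being built (the one_hot helper below is just for illustration, not my actual preprocessing code):

```python
import numpy as np

vocab_size = 80   # data_size: 80-word vocabulary
time_step = 4
batch_size = 30

def one_hot(idx):
    # hypothetical helper: a word as a one-hot vector of length 80
    v = np.zeros(vocab_size, dtype=np.float32)
    v[idx] = 1.0
    return v

# 30 sequences, each of 4 consecutive one-hot words
batch = np.stack([[one_hot((i + t) % vocab_size) for t in range(time_step)]
                  for i in range(batch_size)])
print(batch.shape)  # (30, 4, 80)
```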
Does the design make sense ?
The whole implementation is modified from another tutorial, and I also wonder: what does tf.unstack() actually do?
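For what it's worth, my current understanding of tf.unstack(x, time_step, 1) (which is exactly what I'd like confirmed) is that it splits axis 1 into a Python list of time_step tensors; mimicked in numpy:

```python
import numpy as np

x = np.arange(30 * 4 * 80).reshape(30, 4, 80)
# what I believe tf.unstack(x, 4, 1) produces: a list of 4 slices of shape (30, 80)
unstacked = [x[:, t, :] for t in range(4)]

print(len(unstacked))      # 4 (one entry per timestep)
print(unstacked[0].shape)  # (30, 80)
```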
Those are my problems. I've put the code on GitHub with the "worked version" and "failed version" mentioned above; only the function shown here differs. Please take a look, thanks!