7
votes

I have managed to train a word2vec model with TensorFlow, and I want to feed its output into an RNN with LSTM cells for sequence labeling.

1) It's not clear to me how to use a trained word2vec model with an RNN. (How do I feed in the result?)

2) I can't find much documentation on how to implement a sequence-labeling LSTM. (How do I bring in my labels?)

Could someone point me in the right direction on how to start with this task?

2 Answers

3
votes

I realize this was posted a while ago, but I found this Gist about sequence labeling and this Gist about variable sequence labeling really helpful for figuring out sequence labeling. The basic outline (the gist of the Gist):

  1. Use dynamic_rnn to handle unrolling your network for training and prediction. This method has moved around some in the API, so you may have to find it for your version, but just Google it.
  2. Arrange your data into batches of shape [batch_size, sequence_length, num_features], and your labels into batches of shape [batch_size, sequence_length, num_classes]. Note that you want a label for every time step in your sequence.
  3. For variable-length sequences, pass the sequence_length argument of dynamic_rnn a vector with one entry per sequence in your batch, giving each sequence's true (unpadded) length.
  4. Training the RNN is very similar to training any other neural network once you have the network structure defined: feed it training data and target labels and watch it learn!
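
To make the shapes in steps 2 and 3 concrete, here is a minimal, framework-independent sketch in NumPy. The helper name `pad_batch` and the zero-padding convention are my own assumptions, not something taken from the Gists; it only shows how the padded inputs, the one-hot labels, and the per-sequence lengths line up.

```python
import numpy as np

def pad_batch(sequences, labels, num_classes):
    """Pad variable-length sequences and build per-time-step one-hot labels.

    sequences: list of float arrays, each of shape [length_i, num_features]
    labels:    list of integer arrays, each of shape [length_i]
    (pad_batch is a hypothetical helper, used here for illustration only.)
    """
    batch_size = len(sequences)
    lengths = np.array([len(s) for s in sequences], dtype=np.int32)
    max_len = int(lengths.max())
    num_features = sequences[0].shape[1]

    # Padded positions stay all-zero in both inputs and labels.
    x = np.zeros((batch_size, max_len, num_features), dtype=np.float32)
    y = np.zeros((batch_size, max_len, num_classes), dtype=np.float32)
    for i, (seq, lab) in enumerate(zip(sequences, labels)):
        n = len(seq)
        x[i, :n] = seq
        y[i, np.arange(n), lab] = 1.0  # one label per real time step
    return x, y, lengths
```

Here `x` is what you would feed as the RNN input, `y` is the target, and `lengths` is the kind of vector you would pass as sequence_length.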

And some caveats:

  1. With variable-length sequences, you will need to build masks for calculating your loss and error metrics. It's all in the second link above, but don't forget this when you write your own error metrics! I ran into this a couple of times, and it made my networks look like they were doing much worse on variable-length sequences than they actually were.
  2. You might want to add a regularization term to your loss function. I had some convergence issues without this.
  3. I recommend using tf.train.AdamOptimizer with the default settings at first. Depending on your data, this may not converge and you will need to adjust the settings. This article does a good job of explaining what the different knobs do. Start reading from the beginning, since some of the knobs are explained before the Adam section.
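
The masking caveat is easier to see in code. Below is a rough NumPy sketch (not the code from the Gist) of a masked cross-entropy and a masked accuracy; the function names are hypothetical, and it assumes one-hot labels with zero rows at padded positions and a `lengths` vector holding the true length of each sequence.

```python
import numpy as np

def sequence_mask(lengths, max_len):
    # mask[i, t] is 1.0 for real time steps (t < lengths[i]), 0.0 for padding.
    lengths = np.asarray(lengths)
    return (np.arange(max_len)[None, :] < lengths[:, None]).astype(np.float32)

def masked_cross_entropy(logits, labels_one_hot, lengths):
    # logits, labels_one_hot: [batch_size, sequence_length, num_classes]
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    per_step = -(labels_one_hot * log_probs).sum(axis=-1)  # [batch, seq_len]
    mask = sequence_mask(lengths, logits.shape[1])
    # Average over real time steps only, not over the padded ones.
    return (per_step * mask).sum() / mask.sum()

def masked_accuracy(logits, labels_one_hot, lengths):
    correct = (logits.argmax(-1) == labels_one_hot.argmax(-1)).astype(np.float32)
    mask = sequence_mask(lengths, logits.shape[1])
    return (correct * mask).sum() / mask.sum()
```

Without the mask, padded steps get counted in the average, which skews the reported loss and accuracy for batches with a lot of padding.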

Hopefully these links are helpful to others in the future!

6
votes

I suggest you start by reading the RNN tutorial and the sequence-to-sequence tutorial. They explain how to build LSTMs in TensorFlow. Once you're comfortable with that, you'll have to find the right embedding Variable and assign it the weights from your pre-trained word2vec model.
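
As a rough illustration of what "using the embedding" amounts to, here is a small NumPy sketch. The vocabulary, the random matrix standing in for trained word2vec weights, and the `embed_sentences` helper are all made up for the example; in TensorFlow the indexing step is what tf.nn.embedding_lookup does for you.

```python
import numpy as np

# Hypothetical stand-ins for a trained word2vec model: a word-to-id map and
# a [vocab_size, embedding_dim] matrix (random here, trained in practice).
vocab = {"<pad>": 0, "<unk>": 1, "the": 2, "cat": 3, "sat": 4}
embedding_dim = 8
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), embedding_dim)).astype(np.float32)
embeddings[0] = 0.0  # keep the padding vector at zero

def embed_sentences(sentences, vocab, embeddings):
    """Turn tokenized sentences into [batch_size, max_len, embedding_dim]."""
    lengths = np.array([len(s) for s in sentences], dtype=np.int32)
    max_len = int(lengths.max())
    ids = np.zeros((len(sentences), max_len), dtype=np.int64)  # 0 == <pad>
    for i, sent in enumerate(sentences):
        ids[i, :len(sent)] = [vocab.get(w, vocab["<unk>"]) for w in sent]
    # Row lookup: each word id selects its embedding vector.
    return embeddings[ids], lengths

x, lengths = embed_sentences([["the", "cat", "sat"], ["the", "dog"]],
                             vocab, embeddings)
```

The resulting `x` has the [batch_size, sequence_length, num_features] layout an RNN expects, with num_features equal to the embedding size.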