
I want to build a recurrent neural network (RNN) in TensorFlow that predicts the next word in a sequence of words. I have looked at several tutorials, e.g. the one from TensorFlow. I know that each word in the training text(s) is mapped to an integer index. However, there are still a few things about the input that I don't get:

  1. Networks are trained with batches, e.g. with 128 examples at the same time. Let's say we have 10,000 words in our vocabulary. Is the input to the network a matrix of size (128, sequence_length) or a one-hot encoded tensor (128, sequence_length, 10,000)?

  2. How large is the second dimension, i.e. the sequence length? Do I use one sentence in each row of the batch, padding the sentences that are shorter than others with zeros?

  3. Or can a row correspond to multiple sentences? E.g. can a row stand for "This is a test sentence. How are"? If so, where does the second sentence continue? In the next row of the same batch? Or in the same row in the next batch? How do I guarantee that TensorFlow continues the sentence correctly?

I wasn't able to find answers to these questions, even though they seem quite simple. I hope someone can help!

1 Answer

  1. Yes, conceptually the input is the 3-dimensional one-hot tensor (128, sequence_length, 10,000). In practice, though, you usually feed the integer matrix (128, sequence_length) into tf.nn.embedding_lookup, which is mathematically equivalent to multiplying the one-hot tensor by the embedding matrix but far cheaper, since the one-hot tensor never has to be materialized.

  2. Yes, you should pad your sentences so they all have the same length. In addition, you can use tf.nn.dynamic_rnn, which handles sequences of variable length internally via tf.while_loop. There is a great article dealing with this problem: https://danijar.com/variable-sequence-lengths-in-tensorflow/ You can find more detail in "What's the difference between tensorflow dynamic_rnn and rnn?"

  3. It is possible, but the network doesn't know whether the sentences in different rows are connected; it treats each row as one independent sequence. So the result would be meaningless.
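To make points 1 and 2 concrete, here is a minimal NumPy sketch (the toy vocabulary, embedding dimension, and padding index 0 are made up for illustration). It builds the padded integer batch of shape (batch_size, sequence_length) plus the per-row lengths vector that tf.nn.dynamic_rnn's sequence_length argument expects, and then checks that an embedding lookup on the index matrix equals multiplying the one-hot tensor by the embedding matrix:

```python
import numpy as np

# Hypothetical toy vocabulary; index 0 is reserved for padding.
word_to_id = {"<pad>": 0, "this": 1, "is": 2, "a": 3, "test": 4,
              "sentence": 5, "how": 6, "are": 7, "you": 8}

sentences = [["this", "is", "a", "test", "sentence"],
             ["how", "are", "you"]]

# One sentence per row, right-padded with zeros to the longest length.
max_len = max(len(s) for s in sentences)
batch = np.zeros((len(sentences), max_len), dtype=np.int32)
lengths = np.array([len(s) for s in sentences], dtype=np.int32)
for i, sent in enumerate(sentences):
    batch[i, :len(sent)] = [word_to_id[w] for w in sent]

print(batch.shape)  # (2, 5): (batch_size, sequence_length), integer indices
print(batch[1])     # [6 7 8 0 0]: the shorter sentence, padded with zeros
print(lengths)      # [5 3]: pass as sequence_length= to tf.nn.dynamic_rnn

# An embedding lookup on the index matrix equals multiplying the one-hot
# tensor (batch, seq_len, vocab) by the embedding matrix, which is why
# the one-hot form is rarely built explicitly.
vocab_size, embed_dim = len(word_to_id), 4
embedding = np.random.rand(vocab_size, embed_dim)
one_hot = np.eye(vocab_size)[batch]  # (2, 5, 9) one-hot tensor
assert np.allclose(embedding[batch], one_hot @ embedding)
```

In TensorFlow you would then feed `batch` and `lengths` as the inputs and `sequence_length` of tf.nn.dynamic_rnn, so the RNN stops updating each row's state once its real words run out.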

I hope this answer helps you.