I want to build a recurrent neural network (RNN) in TensorFlow that predicts the next word in a sequence of words. I have looked at several tutorials, e.g. the one from TensorFlow. I know that each word in the training text(s) is mapped to an integer index. However, there are still a few things about the input that I don't understand:
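To make sure I have the basics right, this is how I understand the word-to-index mapping (the words and indices below are just made-up examples):

```python
# My understanding of the word-to-index mapping; the indices are arbitrary examples.
word_to_index = {"this": 12, "is": 7, "a": 431, "test": 2}
sentence = ["this", "is", "a", "test"]
encoded = [word_to_index[w] for w in sentence]
print(encoded)  # [12, 7, 431, 2]
```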
Networks are trained in batches, e.g. with 128 examples at a time. Let's say we have 10,000 words in our vocabulary. Is the input to the network a matrix of size (128, sequence_length) or a one-hot encoded tensor of size (128, sequence_length, 10,000)?
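To make the two alternatives concrete, here is a minimal sketch of what I mean; the batch size of 128, the sequence length of 30 and the vocabulary size of 10,000 are just example numbers:

```python
import numpy as np
import tensorflow as tf

batch_size = 128     # examples per batch
seq_length = 30      # word indices per row (just an example value)
vocab_size = 10_000  # vocabulary size

# Alternative 1: a matrix of integer word indices, shape (128, 30)
indices = np.random.randint(0, vocab_size, size=(batch_size, seq_length))

# Alternative 2: the same data one-hot encoded, shape (128, 30, 10000)
one_hot = tf.one_hot(indices, depth=vocab_size)

print(indices.shape)  # (128, 30)
print(one_hot.shape)  # (128, 30, 10000)
```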
How large is the second dimension, i.e. the sequence length? Do I put one sentence in each row of the batch and pad the sentences that are shorter than the others with zeros?
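If each row is one padded sentence, I imagine it would look roughly like this (using pad_sequences from Keras; the sentences and indices are made up):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Three sentences of different lengths, already mapped to integer indices
# (the indices are made-up examples).
sentences = [
    [12, 7, 431, 2],             # "this is a test"
    [55, 9],                     # "how are"
    [12, 7, 431, 2, 88, 3, 19],  # a longer sentence
]

# Pad every row with zeros at the end so all rows have the same length.
batch = pad_sequences(sentences, padding="post", value=0)
print(batch)
# [[ 12   7 431   2   0   0   0]
#  [ 55   9   0   0   0   0   0]
#  [ 12   7 431   2  88   3  19]]
```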
Or can a row correspond to multiple sentences? E.g. can a row stand for "This is a test sentence. How are"? If so, where does the second sentence continue? In the next row of the same batch, or in the same row of the next batch? How do I guarantee that TensorFlow continues the sentence correctly?
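To illustrate what I mean, here is one layout I could imagine, where a long text is cut into consecutive windows and each row continues in the same row of the next batch. I don't know whether this is what TensorFlow actually expects; it is just my guess:

```python
import numpy as np

# A long running text mapped to indices (made-up values standing in for 24 words).
text_indices = np.arange(1, 25)

batch_size = 2
seq_length = 4

# My guess: cut the text into consecutive windows of seq_length words so that
# each row of a batch continues in the SAME row of the NEXT batch, which would
# allow the RNN state of that row to be carried over between batches.
num_batches = len(text_indices) // (batch_size * seq_length)
data = text_indices[: num_batches * batch_size * seq_length]
data = data.reshape(batch_size, -1)            # one long stream per row
batches = np.split(data, num_batches, axis=1)  # consecutive windows

for i, b in enumerate(batches):
    print(f"batch {i}:\n{b}")
# batch 0: [[ 1  2  3  4]   batch 1: [[ 5  6  7  8]   batch 2: [[ 9 10 11 12]
#           [13 14 15 16]]            [17 18 19 20]]            [21 22 23 24]]
```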
I wasn't able to find answers to these questions, even though they seem quite simple. I hope someone can help!