I am working on a character-level text generator using Keras. While going through examples and tutorials, there is one thing I still do not understand.
The training data (X) is split into semi-redundant sequences of length maxlen, with y being the character immediately following each sequence.
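For context, the splitting step I mean looks like this (a sketch modeled on the Keras lstm_text_generation example; the corpus, maxlen, and step values here are just illustrative):

```python
maxlen = 40   # length of each training sequence
step = 3      # stride between sequence starts, hence "semi-redundant" overlap

text = "an example training corpus " * 20   # placeholder corpus

sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i:i + maxlen])   # X: the input sequence
    next_chars.append(text[i + maxlen])    # y: the character that follows it
```

Because step < maxlen, consecutive sequences overlap, which is where the "semi-redundant" name comes from.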
I understand this is done for efficiency, and that it means training will only capture dependencies within maxlen characters.
What I am struggling to understand is why it is done in sequences at all. I thought LSTMs/RNNs were trained by inputting characters one at a time and comparing each predicted next character to the actual next character. That seems very different from inputting, say, maxlen=50 characters at a time and comparing each length-50 sequence to the single next character.
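To make the shapes concrete, here is how I understand the vectorized training data ends up looking (a toy sketch with made-up sizes, modeled on the tutorial's one-hot encoding): each training example is a whole sequence of maxlen one-hot vectors, not a single character.

```python
import numpy as np

# Toy corpus and splitting (sizes chosen small for illustration).
text = "hello world " * 5
maxlen, step = 8, 3
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}

sentences = [text[i:i + maxlen] for i in range(0, len(text) - maxlen, step)]
next_chars = [text[i + maxlen] for i in range(0, len(text) - maxlen, step)]

# X holds one one-hot vector per character per timestep;
# y holds the one-hot next character for each sequence.
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = True
    y[i, char_indices[next_chars[i]]] = True

print(X.shape)  # (num_sequences, maxlen, num_chars)
print(y.shape)  # (num_sequences, num_chars)
```

So the model sees inputs of shape (maxlen, num_chars) per example, which is exactly the mismatch with my one-character-at-a-time mental picture.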
Does Keras actually break up the training sequences and input them character by character "under the hood"?
If not, why not?