I don't understand how LSTM layers are fed with data.
An LSTM layer requires its input to have three dimensions (x, y, z).
I have a dataset of time series: 2900 rows in total, which should conceptually be divided into groups of 23 consecutive rows, where each row is described by 178 features. In other words, every 23 rows I have a new 23-row sequence belonging to a new patient.
Are the following statements right?
- x (samples) = number of 23-row sequences, namely len(dataframe)/23
- y (time steps) = length of each sequence, which is 23 here by domain assumption
- z (feature size) = number of columns per row, 178 in this case

Therefore x * y = number of rows in the dataset.
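To check my understanding, here is a minimal NumPy sketch of the reshape I have in mind. The array is random stand-in data, and I'm assuming the row count is an exact multiple of 23 (the variable names are mine, not from any library):

```python
import numpy as np

# Synthetic stand-in for my dataframe: each patient contributes
# 23 consecutive rows of 178 features (values here are random).
n_patients = 126    # assumption: len(dataframe) // 23
timesteps = 23      # rows per patient, by domain assumption
n_features = 178    # columns per row

flat = np.random.rand(n_patients * timesteps, n_features)

# Reshape the flat 2-D table into the 3-D tensor an LSTM expects:
# (samples, time steps, features), i.e. (x, y, z).
X = flat.reshape(n_patients, timesteps, n_features)
print(X.shape)  # (126, 23, 178)
```

Since the rows of each patient are consecutive in the dataframe, this reshape keeps each patient's 23 rows together as one sample.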
Assuming this is correct, what is the batch size while training a model in this case?
Might it be the number of samples considered in an epoch during training?
Therefore, if x (the number of samples) were equal to 200, it would make no sense to set a batch_size greater than 200, because that's my upper limit: I don't have more data to train on.
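To make the relation concrete, here is how I'd compute how many mini-batches one epoch would contain for a given batch size (the 200 and 32 are hypothetical values of my own choosing):

```python
import math

n_samples = 200   # hypothetical number of patient sequences (x)
batch_size = 32   # a typical choice; anything above n_samples is pointless

# One epoch passes over all samples once, split into mini-batches;
# the last batch may be smaller if n_samples % batch_size != 0.
batches_per_epoch = math.ceil(n_samples / batch_size)
print(batches_per_epoch)  # 7
```

With batch_size equal to n_samples, this gives exactly one batch per epoch, which is the upper limit I described.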