As (I think) I understand it, in Keras, LSTM layers expect input data to have 3 dimensions: (batch_size, timesteps, input_dim).
However, I'm really struggling to understand what these values actually correspond to when it comes to my data. I'm hoping that if someone can explain how I might go about inputting the following mock data (with a similar structure to my actual dataset) to an LSTM layer, I might then understand how I can achieve this with my real dataset.
So the example data is sequences of categorical data encoded as one-hot vectors. For example, the first 3 samples look like this:
[ [0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0] ]
[ [0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0] ]
[ [0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1] ]
i.e. the sequences are of length 5, with 4 categorical options for each position within the sequence. Let's also say I have 3000 sequences, and that it's a binary classification problem.
So I believe this would make the shape of my dataset (3000, 5, 4)?
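A quick sanity check with placeholder data (random one-hot vectors, not my real dataset) seems to confirm that stacking everything gives a 3-D array of that shape:

```python
import numpy as np

num_samples, timesteps, num_categories = 3000, 5, 4

# Placeholder data: pick a random category index at each timestep, then
# one-hot encode it by indexing rows of the identity matrix.
labels = np.random.randint(num_categories, size=(num_samples, timesteps))
x = np.eye(num_categories)[labels]

print(x.shape)  # (3000, 5, 4) -> (batch_size, timesteps, input_dim)
```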
The model I want to use looks like this:
model = keras.Sequential([
    keras.layers.LSTM(units=3, batch_input_shape=(???)),
    keras.layers.Dense(128, activation='tanh'),
    keras.layers.Dense(64, activation='tanh'),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20)
This ignores any training/testing split for now, so just assume I'm training with the entire dataset. The part I'm struggling with is input_shape.
I want each element within the sequence to be a timestep. I've tried lots of different shapes and got lots of different errors. I'm guessing I actually need to reshape x_train instead of just adjusting input_shape. The problem is I have no idea what shape it actually needs to be.
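The easiest check I can think of is converting the nested lists to a NumPy array first, since plain Python lists have no .shape attribute (a sketch with placeholder lists standing in for the mock data):

```python
import numpy as np

# Placeholder nested lists standing in for the mock dataset: 3 samples,
# each a length-5 sequence of 4-element one-hot vectors.
x_train = [[[0, 0, 0, 1]] * 5] * 3

x_arr = np.asarray(x_train)
print(x_arr.ndim, x_arr.shape)  # 3 (3, 5, 4) -> already a 3-D layout
```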
I think I understand the theory behind LSTM, it's just the practicalities of the dimensionality requirements that I'm struggling to get my head around.
Any help or advice would be massively appreciated. Thank you.
EDIT - As suggested by @scign, here is an example of an error I'm getting, using the following code with the mock dataset:
x_train = [[0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]], [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]], [[0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
y_train = [1, 0, 1]
model = keras.Sequential([
    keras.layers.LSTM(units=3, batch_input_shape=(1, 5, 4)),
    keras.layers.Dense(128, activation='tanh'),
    keras.layers.Dense(64, activation='tanh'),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20)
Error - ValueError: Error when checking input: expected lstm_input to have 3 dimensions, but got array with shape (5, 4)
Comment: batch_input_shape should be given as (batch_size, timesteps, data_dim). See keras.io/getting-started/sequential-model-guide for some examples. – alec_djinn
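Following that (batch_size, timesteps, data_dim) hint, my understanding is that the mock data needs to be stacked into a single NumPy array rather than left as a tuple of three 2-D lists, and the batch size left out of the shape argument. An untested sketch of what I mean:

```python
import numpy as np

# Stack the three mock samples into one 3-D array; the "(5, 4)" in the
# error suggests Keras was seeing each sample on its own, without the
# batch dimension.
x_train = np.array([
    [[0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0]],
    [[0, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]],
], dtype="float32")
y_train = np.array([1, 0, 1], dtype="float32")

print(x_train.shape)  # (3, 5, 4) -> (batch_size, timesteps, input_dim)

# The LSTM layer would then take input_shape=(5, 4), i.e. (timesteps,
# input_dim), leaving the batch size out so any batch size is accepted:
# keras.layers.LSTM(units=3, input_shape=(5, 4))
```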