Understanding Keras LSTM NN input & output for binary classification

Question

I am trying to create a simple LSTM network that, based on the last 16 time frames, produces some output. Let's say I have a dataset with 112000 rows (measurements) and 7 columns (6 features + class). As I understand it, I have to "pack" the dataset into some number of 16-element-long batches. With 112000 rows that would mean 112000/16 = 7000 batches, and therefore a 3D numpy array with shape (7000, 16, 7). Splitting this array into train and test data, and separating the class column from the features, I get these shapes:

xtrain.shape == (5000, 16, 6)
ytrain.shape == (5000, 16)
xtest.shape == (2000, 16, 6)
ytest.shape == (2000, 16)

My model looks like this:

import keras

model = keras.Sequential()
model.add(keras.layers.LSTM(8, input_shape=(16, 6), stateful=True, batch_size=16, name="input"))
model.add(keras.layers.Dense(5, activation="relu", name="hidden1"))
model.add(keras.layers.Dense(1, activation="sigmoid", name="output"))
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(xtrain, ytrain, batch_size=16, epochs=10)

However after trying to fit the model I get this error:

ValueError: Error when checking target: expected output to have shape (1,) but got array with shape (16,)

My guess is that the model expects a single output per batch (so the ytrain shape should be (5000,)) instead of 16 outputs, one for every entry in a batch (5000, 16).

If that is the case, should I, instead of packing the data like this, create a 16-element-long batch for every output? That would give:

xtrain.shape == (80000, 16, 6)
ytrain.shape == (80000,)
xtest.shape == (32000, 16, 6)
ytest.shape == (32000,)

dataista · Accepted Answer · 2018-11-22T23:56:36

You are close with the last comments of the question. Since it's a binary classification problem, you should have 1 output per input sequence, so you need to get rid of the 16 in your ys and replace it with a 1.

Besides, because the network is stateful, the number of training samples must be divisible by your batch size (16), so you could use 5008 samples, for example (5008 = 16 × 313).
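The "one 16-step window per output" packing from the question can be sketched with plain numpy. This is only an illustration under assumptions the thread does not confirm: the helper name `make_windows` is made up, the class is assumed to be the last column, and each window is labelled with the class of its final time step.

```python
import numpy as np

def make_windows(data, window=16):
    """Build overlapping windows with exactly one label per window.

    Assumes `data` has shape (rows, n_features + 1) with the class in the
    last column, and labels each window with the class of its last row
    (an assumption; the question does not say which row's label to use).
    """
    features, labels = data[:, :-1], data[:, -1]
    n_windows = len(data) - window + 1
    # Window i covers rows i .. i + window - 1.
    x = np.stack([features[i:i + window] for i in range(n_windows)])
    # One label per window, shaped (n_windows, 1) to match Dense(1).
    y = labels[window - 1:].reshape(-1, 1)
    return x, y

# Small synthetic stand-in for the 112000 x 7 dataset.
data = np.random.rand(1000, 7)
data[:, -1] = (data[:, -1] > 0.5).astype(float)

x, y = make_windows(data)
print(x.shape, y.shape)  # (985, 16, 6) (985, 1)
```

Note that overlapping windows over N rows give N - 16 + 1 samples, so the full dataset would yield slightly fewer than 112000 windows.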

In fact:

ytrain.shape == (5000, 1)

This gets past the error you mention, but raises a new one:

ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 5000 samples

Which is addressed by ensuring that:

xtrain.shape == (5008, 16, 6)
ytrain.shape == (5008, 1)
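If adding samples to reach 5008 is not convenient, an alternative is to drop the few trailing samples that do not fill a whole batch. The sketch below is a hypothetical helper (`trim_to_batch_multiple` is not a Keras function) and uses zero-filled arrays as placeholders for the real data:

```python
import numpy as np

def trim_to_batch_multiple(x, y, batch_size=16):
    """Drop trailing samples so len(x) is a multiple of batch_size."""
    usable = (len(x) // batch_size) * batch_size
    return x[:usable], y[:usable]

# Placeholder arrays with the shapes discussed above.
xtrain = np.zeros((5000, 16, 6))
ytrain = np.zeros((5000, 1))

xtrain, ytrain = trim_to_batch_multiple(xtrain, ytrain)
print(xtrain.shape, ytrain.shape)  # (4992, 16, 6) (4992, 1)
```

The same check applies to the test set when evaluating or predicting with a stateful model; 2000 is already divisible by 16.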
