Input shape for LSTM which has one hot encoded data

Question

The sample dataset contains the location points of a user.

df.head()

   user           tslot         Location_point
0   0   2015-12-04 13:00:00     4356
1   0   2015-12-04 13:15:00     4356
2   0   2015-12-04 13:30:00     3659
3   0   2015-12-04 13:45:00     4356
4   0   2015-12-04 14:00:00     8563

df.shape 

(288,3)

As the location points are categorical values, they are one-hot encoded.

encoded = to_categorical(df['Location_point'])

The encoded values are shown below:

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

The shape of the encoded values is (288, 8564).

I tried to shape the training data as follows:

X_trai = []
y_trai = []
for i in range(96, 288):
    X_trai.append(encoded[i-96:i])
    y_trai.append(encoded[i])
X_trai, y_trai = np.array(X_trai), np.array(y_trai)

X_trai = np.reshape(X_trai, (X_trai.shape[0], X_trai.shape[1], 1))

And the model is

regressor = Sequential()

regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_trai.shape[1], 1)))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

regressor.add(Dense(units = 1))

regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

regressor.fit(X_trai, y_trai, epochs = 100, batch_size = 32)

This is not the correct model. I am new to deep learning. I looked at some examples but could not understand how they apply to one-hot encoded data. I would be grateful if someone could explain the input shape, the output shape, and the correct model.

The input is the sequence of location points and the output is the predicted next location point for that user.

Also, can you please post the error. I think it requires 3 Dimensional shape of the input. — Ashwin Geet D'Sa
yeah, but I don't get how to convert my data to 3 dimensional. can you suggest my input based on my data, — Krush23
ValueError: Error when checking input: expected dense_3_input to have 2 dimensions, but got array with shape (1, 288, 8654). The error is regarding the shape of the input. — Krush23

kevin kevin · Accepted Answer · 2019-06-20T22:01:04

The input shape depends on your data. If you have a single sample with 288 timesteps and 8564 features, your input shape will be (batch_size=1, timesteps=288, n_features=8564). If you have 288 samples of a single timestep each, it would be (batch_size=288, timesteps=1, n_features=8564).
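Both layouts can be obtained from the same 2D array by adding an axis. This is a minimal sketch, assuming `encoded` is the (288, 8564) one-hot matrix from the question; a zero-filled stand-in is used here so the snippet runs on its own.

```python
import numpy as np

# Stand-in for the one-hot matrix from the question: 288 rows, 8564 columns.
encoded = np.zeros((288, 8564), dtype=np.float32)

# One sample made of 288 timesteps.
one_sequence = encoded[np.newaxis, :, :]
print(one_sequence.shape)   # (1, 288, 8564)

# 288 samples of a single timestep each.
single_steps = encoded[:, np.newaxis, :]
print(single_steps.shape)   # (288, 1, 8564)
```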

These two tutorials cover how to prepare your data for LSTM models and how to one-hot encode sequence data: https://machinelearningmastery.com/reshape-input-data-long-short-term-memory-networks-keras/ https://machinelearningmastery.com/how-to-one-hot-encode-sequence-data-in-python/

The input shape for the LSTM is the following:

3D tensor with shape (batch_size, timesteps, input_dim), (Optional) 2D tensors with shape (batch_size, output_dim).

timesteps is the length of your time-series sequences and input_dim is the number of features. Since your data is one-hot encoded, input_dim would be 8564.
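For the windowed setup in the question (96 past rows as input, the next row as target), the stacked windows already form a 3D array. The sketch below assumes a small stand-in one-hot matrix with 10 classes instead of 8564 to keep memory low; `window`, `n_features` and the random labels are illustrative, not taken from the original data.

```python
import numpy as np

n_rows, window, n_features = 288, 96, 10   # n_features is 8564 in the question

# Stand-in one-hot matrix: each row has a single 1 at a random class index.
rng = np.random.default_rng(0)
labels = rng.integers(0, n_features, size=n_rows)
encoded = np.eye(n_features, dtype=np.float32)[labels]

# Each sample is `window` consecutive rows; the target is the row that follows.
X = np.stack([encoded[i - window:i] for i in range(window, n_rows)])
y = encoded[window:]

print(X.shape)  # (192, 96, 10) -> (samples, timesteps, input_dim)
print(y.shape)  # (192, 10)     -> one one-hot target per sample
```

Since X is already 3D, it does not need a further reshape to (samples, timesteps, 1); the matching `input_shape` for the first LSTM layer would be (window, n_features).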

The output shape will depend on the architecture of your model.

The first, second and third LSTM layers (return_sequences = True) each give an output of (batch_size, timesteps, units)
The fourth LSTM layer: (batch_size, units)
The last (Dense) layer: (batch_size, 1)
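The shapes for the model in the question (four LSTM layers followed by a Dense layer) can be traced by hand. The sketch below does this with two hypothetical helper functions, `lstm_output_shape` and `dense_output_shape`, which are not part of Keras and only encode the shape rule; it assumes 96 timesteps and 8564 features. Dropout layers are skipped because they do not change the shape.

```python
def lstm_output_shape(input_shape, units, return_sequences):
    """Hypothetical helper: LSTM output shape for a (batch, timesteps, features) input."""
    batch, timesteps, _ = input_shape
    return (batch, timesteps, units) if return_sequences else (batch, units)


def dense_output_shape(input_shape, units):
    """Hypothetical helper: Dense only replaces the last dimension."""
    return input_shape[:-1] + (units,)


shape = (None, 96, 8564)          # None stands for the batch size
shapes = []
for return_sequences in (True, True, True, False):
    shape = lstm_output_shape(shape, 50, return_sequences)
    shapes.append(shape)
shapes.append(dense_output_shape(shape, 1))

for s in shapes:
    print(s)
```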

In any case, you can check your model's input and output shapes with:

regressor.input_shape and regressor.output_shape

Lastly, why don't you consider your Location_point as a numeric variable?
