I am trying to build a predictive model on stock prices. From what I've read, LSTM is a good layer to use. However, I can't work out what my input_shape needs to be for my model.
Here is the tail of my DataFrame:
I then split the data into train and test sets:
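from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# 'close' is the target; every other column is a feature
labels = df['close'].values
x_train_df = df.drop(columns=['close'])
# shuffle=False keeps the rows in chronological order
x_train, x_test, y_train, y_test = train_test_split(x_train_df.values, labels, test_size=0.2, shuffle=False)
# Fit the scaler on the training split only, then reuse it on the test split
min_max_scaler = MinMaxScaler()
x_train = min_max_scaler.fit_transform(x_train)
x_test = min_max_scaler.transform(x_test)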
print('y_train', y_train.shape)
print('y_test', y_test.shape)
print('x_train', x_train.shape)
print('x_test', x_test.shape)
print(x_train)
This yields:
Here's where I am getting confused. Running the simple example, I get the following error:
ValueError: Input 0 of layer lstm_15 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1, 4026, 5]
I've tried various combinations of values for input_shape and have come to the conclusion that I have no idea how to determine the input shape.
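from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

model = Sequential()
# This input_shape is the one that raises the ValueError above
model.add(LSTM(32, input_shape=(1, x_train.shape[0], x_train.shape[1])))
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10)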
Given my DataFrame, what should my input_shape be? I understand that the input shape is (batch size, timesteps, data dim). I'm just not clear on how to map those terms onto my actual data, because the values I assumed turned out to be wrong.
I was thinking:
- Batch Size: Number of records I'm passing in (4026)
- Time Steps: 1 (I'm not sure if this is supposed to be the same value as batch size?)
- Data Dimension: 1 since my data is 1 dimensional (I think?)
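
For reference, here is a minimal sketch of one common way a 2D feature matrix gets mapped onto the (batch size, timesteps, data dim) layout: a sliding window over consecutive rows. It is only an illustration under stated assumptions. The `make_windows` helper is hypothetical, the window length of 10 is arbitrary, random numbers stand in for the real DataFrame, and the 4026 rows and 5 features are taken from the shape in the error message. It also assumes each window should predict the target of the row that follows it, which may not match the intended setup.

```python
import numpy as np

def make_windows(features, targets, timesteps):
    """Hypothetical helper: stack `timesteps` consecutive rows into each sample."""
    n_windows = len(features) - timesteps
    # Window i holds rows i .. i+timesteps-1; its target is the row right after it.
    x = np.stack([features[i:i + timesteps] for i in range(n_windows)])
    y = targets[timesteps:]
    return x, y

# Stand-in data with the shapes implied by the error message (assumption).
rng = np.random.default_rng(0)
features = rng.random((4026, 5))   # (rows, features) after scaling
targets = rng.random(4026)         # one 'close' value per row

timesteps = 10                     # arbitrary window length for illustration
x_windows, y_windows = make_windows(features, targets, timesteps)

print(x_windows.shape)  # (samples, timesteps, data dim)
print(y_windows.shape)  # (samples,)

# Keras adds the batch axis itself, so input_shape would cover only
# the last two axes: (timesteps, data dim).
lstm_input_shape = x_windows.shape[1:]
print(lstm_input_shape)
```

Under this layout, the number of records ends up on the first (batch) axis, the window length is the timesteps value, and the number of feature columns is the data dimension. Using `timesteps = 1` would amount to reshaping the array to `(rows, 1, features)`.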