I'm using this code to predict the stock price in Jupyter:

import keys
import datetime
from binance.client import Client
import pandas as pd

client = Client(keys.APIKey, keys.SecretKey)

symbol= 'BTCUSDT'
BTC= client.get_historical_klines(symbol=symbol, interval=Client.KLINE_INTERVAL_30MINUTE, start_str="1 year ago UTC")

%matplotlib inline

BTC= pd.DataFrame(BTC, columns=['Open time', 'Open', 'High', 'Low', 
                                'Close', 'Volume', 'Close time', 
                                'Quote asset volume','Number of trades',
                                'Taker buy base asset volume', 
                                'Taker buy quote asset volume','Ignore'])

BTC['Open time'] = pd.to_datetime(BTC['Open time'], unit='ms')

BTC.set_index('Open time', inplace=True)
BTC

data= BTC.iloc[:,3:4].astype(float).values
from sklearn.preprocessing import MinMaxScaler

scaler= MinMaxScaler()
data=scaler.fit_transform(data)
training_set= data[:10000]
test_set=data[10000:]

X_train= training_set[0:len(training_set)-1]
y_train= training_set[1:len(training_set)]
X_test= test_set[0:len(test_set)-1]
y_test= test_set[1:len(test_set)]

import numpy as np
X_train = np.reshape(X_train, (len(X_train), 1, X_train.shape[1]))
x_test = np.reshape(X_test, (len(X_test), 1, X_test.shape[1]))

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(LSTM(256))
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=50, batch_size=16, shuffle=False)

predicted_price= model.predict(X_test)
predicted_price= scaler.inverse_transform(predicted_price)
real_price = scaler.inverse_transform(y_test)

But instead of getting the real vs. predicted graph, I get this error:


ValueError Traceback (most recent call last) in
----> 1 predicted_price= model.predict([X_test])
      2 predicted_price= scaler.inverse_transform(predicted_price)
      3 real_price = scaler.inverse_transform(y_test)

E:\anaconda3\lib\site-packages\keras\engine\training.py in predict(self, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing)
   1439
   1440         # Case 2: Symbolic tensors or Numpy array-like.
-> 1441         x, _, _ = self._standardize_user_data(x)
   1442         if self.stateful:
   1443             if x[0].shape[0] > batch_size and x[0].shape[0] % batch_size != 0:

E:\anaconda3\lib\site-packages\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
    577                 feed_input_shapes,
    578                 check_batch_axis=False,  # Don't enforce the batch size.
--> 579                 exception_prefix='input')
    580
    581         if y is not None:

E:\anaconda3\lib\site-packages\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    133                         ': expected ' + names[i] + ' to have ' +
    134                         str(len(shape)) + ' dimensions, but got array '
--> 135                         'with shape ' + str(data_shape))
    136         if not check_batch_axis:
    137             data_shape = data_shape[1:]

ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (7505, 1)

Even with this traceback, I can't pinpoint the root cause in order to fix it.

1 Answer

LSTM layers expect 3-dimensional input: (samples, timesteps, features).
For example, if you have 7505 audio files, each with 100 timesteps, and each timestep has 578 features, the training set should have the shape (7505, 100, 578).
Your input has the shape (samples, features), so reshape it to be 3-dimensional, for both X_train and X_test.

x_train = np.reshape(X_train, (len(X_train), 1, X_train.shape[1]))
x_test = np.reshape(X_test, (len(X_test), 1, X_test.shape[1]))
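
A quick way to confirm what the reshape does, as a minimal sketch on synthetic data (the random array below is only a stand-in for your scaled prices, not the real Binance data):

```python
import numpy as np

# Stand-in for the scaled test features: 7505 samples, 1 feature each.
X_test = np.random.rand(7505, 1)
print(X_test.shape)  # 2D (samples, features): the shape the LSTM layer rejects

# Insert a timesteps axis of length 1 between samples and features.
x_test = np.reshape(X_test, (len(X_test), 1, X_test.shape[1]))
print(x_test.shape)  # 3D (samples, timesteps, features)
```

Note that `X_test` itself is left unchanged; only the new name `x_test` refers to the 3D array.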

Also, make sure that you call fit and predict with the reshaped data.

model.fit(x_train, y_train, epochs=50, batch_size=16, shuffle=False)
#.....
predicted_price= model.predict(x_test) #<--- Your problem was here! Make sure you use x_test and not X_test.
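
If you want to catch this kind of mix-up early, a small guard can be applied to whatever you pass to fit and predict. This is only a sketch: `as_lstm_input` is a hypothetical helper name, not part of Keras, and it assumes one timestep per sample, as in your code.

```python
import numpy as np

def as_lstm_input(x):
    """Return x as a 3D array (samples, timesteps, features).

    Hypothetical helper: a 2D input is treated as one timestep per sample.
    """
    x = np.asarray(x)
    if x.ndim == 3:
        return x  # already in the layout LSTM layers expect
    if x.ndim == 2:
        return x.reshape(x.shape[0], 1, x.shape[1])
    raise ValueError(f"expected a 2D or 3D array, got shape {x.shape}")

# Stand-in array with the same shape as the one in the error message.
X_test = np.zeros((7505, 1))
print(as_lstm_input(X_test).shape)
```

With that in place, `model.predict(as_lstm_input(X_test))` would work whether or not the array was reshaped beforehand.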