Train-Test split for Time Series Data to be used for LSTM

Question

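from sklearn.model_selection import train_test_split

values = df.values
train, test = train_test_split(values)

# Separate the features (all but the last column) from the target (last column)
X_train, y_train = train[:, :-1], train[:, -1]
X_test, y_test = test[:, :-1], test[:, -1]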

Executing the above code splits the time series dataset into 75% training and 25% testing. I want to control the train-test split, for example 80-20 or 90-10. Can someone please help me understand how to split the dataset into any ratio I want?

The concept is borrowed from https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/.

Note: I cannot split the dataset randomly into train and test; the most recent values have to be used for testing. I have included a screenshot of my dataset.

If anyone can interpret the code, please do help me understand the above. Thanks.

LazyCoder · Accepted Answer · 2020-09-28T19:01:28

Here's the documentation.

Basically, you'll want to do something like train_test_split(values, test_size=0.2, shuffle=False)

test_size=0.2 tells the function to make the test set 20% of the input data (you can similarly specify the training set size with train_size; if it is omitted, the function uses 1 - test_size, i.e. the complement of the test set).

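shuffle=False tells the function not to shuffle the rows before splitting, so the test set is taken from the end of the data. If the rows are in chronological order, that means the most recent values go to testing.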
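To make this concrete, here is a minimal sketch of an 80/20 ordered split. The values array below is a hypothetical stand-in for df.values (10 rows in chronological order, target in the last column), not the asker's actual data. The manual slicing at the end is an alternative I'm adding for comparison; it gives the same result here, though the rounding may differ by a row when the ratio doesn't divide the row count evenly.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for df.values: 10 chronologically ordered rows,
# two feature columns followed by the target in the last column.
values = np.arange(30).reshape(10, 3)

# 80/20 split. shuffle=False keeps the row order, so the first 80% of rows
# become the training set and the last 20% become the test set.
train, test = train_test_split(values, test_size=0.2, shuffle=False)

# Separate the features from the target, as in the question.
X_train, y_train = train[:, :-1], train[:, -1]
X_test, y_test = test[:, :-1], test[:, -1]

# Alternative without scikit-learn: slice at the split index directly.
split_idx = int(len(values) * 0.8)
manual_train, manual_test = values[:split_idx], values[split_idx:]

print(train.shape, test.shape)
```

For a 90-10 split, change test_size to 0.1 (and the 0.8 in the manual version to 0.9).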
