How do I select train data for LSTM network training

Question

I'm basically new to RNNs, but I'm trying to predict signals based on recordings. I have two sets of data, A and B: A is the raw data recording, and B is the binary labeled data marking '1' for every active event on A. Both have shape (1895700, 1).

Could you help me figure out what should be used as x and y train?

I've been reading about this and understood that I should loop through A and extract x and y from it. I did this and got input shapes of x_train (189555, 150, 1) and y_train (189555, 150, 1), but I'm getting an accuracy of 0.0000e+00 and a negative loss.

My other approach was to use A as x_train and B as y_train, with input shapes of (12638, 150, 1), but from the first step of epoch 1 I had an accuracy of 96 and a loss of around .10. They didn't vary much throughout training.

So I'm not really sure what data should be my input.

model:

from keras.models import Sequential
from keras.layers import LSTM, Dense

ts, features = 150, 1  # window length and feature count, taken from the shapes above
model = Sequential()
# Only the first layer needs input_shape; the later layers infer it.
model.add(LSTM(128, dropout=0.5, recurrent_dropout=0.4, return_sequences=True, input_shape=(ts, features)))
model.add(LSTM(128, dropout=0.5, recurrent_dropout=0.3, return_sequences=True))
model.add(LSTM(64, dropout=0.5, recurrent_dropout=0.3, return_sequences=True))
model.add(Dense(features, activation="sigmoid"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Thanks in advance!

If I understood correctly, then A is your data and B is the label of your data. Don't mix data with labels! You get your x_train and x_test from A and y_train and y_test from B. — Ralvi Isufaj
OK, so y_train comes from B, thanks! At the moment I'm not using validation during training; I use predictions = model.predict(x_test) after training has finished instead. I also noticed that training accuracy decreases a lot when adding more features. Why could this be? @RalviIsufaj @AnnaMaule — ca rc

Anna Maule · Accepted Answer · 2020-04-20T22:52:49

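Your X_train is the data that represents your features, while Y_train is the data that represents the expected output for those X_train features. In your case, X_train comes from A (the raw recording) and Y_train comes from B (the binary labels).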
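
As a rough sketch of that pairing, the snippet below cuts an aligned signal/label pair into non-overlapping windows of 150 steps, which is what would give the (12638, 150, 1) shape from the question (1895700 / 150 = 12638). The helper name make_windows, the synthetic A and B arrays, and the 1.0 threshold are all assumptions made for illustration; they are not from the original post.

```python
import numpy as np

def make_windows(signal, labels, ts):
    """Cut an aligned signal/label pair into non-overlapping windows of length ts.

    Hypothetical helper: returns x and y with shape (n_windows, ts, 1).
    """
    signal = np.asarray(signal).reshape(-1)
    labels = np.asarray(labels).reshape(-1)
    if signal.shape != labels.shape:
        raise ValueError("signal and labels must have the same length")
    n_windows = signal.shape[0] // ts   # any incomplete trailing window is dropped
    usable = n_windows * ts
    x = signal[:usable].reshape(n_windows, ts, 1)
    y = labels[:usable].reshape(n_windows, ts, 1)
    return x, y

# Synthetic stand-ins for A (raw recording) and B (binary event labels).
rng = np.random.default_rng(0)
A = rng.normal(size=(1500, 1))
B = (A > 1.0).astype("float32")         # arbitrary threshold, for illustration only

X_train, Y_train = make_windows(A, B, ts=150)
print(X_train.shape, Y_train.shape)
```

Each window in X_train lines up step by step with the same window in Y_train, so the model sees the raw signal as input and the 0/1 labels as the per-timestep target.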

You can split your data simply by passing the validation_split parameter to the fit function:

model.fit(X_data, Y_data, batch_size=4, epochs=5, verbose=1, validation_split=0.2)

In this case, Keras holds out 20% of the data for validation.
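
Per the Keras documentation, validation_split takes the validation samples from the end of the arrays, before any shuffling. A minimal sketch of an equivalent manual split is shown below; the helper name holdout_split and the placeholder arrays are assumptions for illustration only.

```python
import numpy as np

def holdout_split(x, y, validation_split=0.2):
    """Hold out the last fraction of samples, mirroring validation_split.

    Hypothetical helper: returns ((x_train, y_train), (x_val, y_val)).
    """
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Placeholder data: 10 windows of 150 steps with 1 feature each.
X_data = np.arange(10 * 150, dtype="float32").reshape(10, 150, 1)
Y_data = np.zeros((10, 150, 1), dtype="float32")

(x_tr, y_tr), (x_val, y_val) = holdout_split(X_data, Y_data, validation_split=0.2)
print(x_tr.shape, x_val.shape)
```

The resulting arrays could then be passed as model.fit(x_tr, y_tr, validation_data=(x_val, y_val), ...). For a recording like this one, it is worth checking that the held-out tail actually contains some active events, since a tail with only zeros would make the validation accuracy look better than it is.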
