I am new to neural networks and have two, probably pretty basic, questions. I am setting up a generic LSTM Network to predict the future of sequence, based on multiple Features. My training data is therefore of the shape (number of training sequences, length of each sequence, amount of features for each timestep). Or to make it more specific, something like (2000, 10, 3). I try to predict the value of one feature, not of all three.
- Problem:
If I make my Network deeper and/or wider, the only output I get is the constant mean of the values to be predicted. Take this setup for example:
z0 = Input(shape=[None, len(dataset[0])])
z = LSTM(32, return_sequences=True, activation='softsign', recurrent_activation='softsign')(z0)
z = LSTM(32, return_sequences=True, activation='softsign', recurrent_activation='softsign')(z)
z = LSTM(64, return_sequences=True, activation='softsign', recurrent_activation='softsign')(z)
z = LSTM(64, return_sequences=True, activation='softsign', recurrent_activation='softsign')(z)
z = LSTM(128, activation='softsign', recurrent_activation='softsign')(z)
z = Dense(1)(z)
model = Model(inputs=z0, outputs=z)
print(model.summary())
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history= model.fit(trainX, trainY,validation_split=0.1, epochs=200, batch_size=32,
callbacks=[ReduceLROnPlateau(factor=0.67, patience=3, verbose=1, min_lr=1E-5),
EarlyStopping(patience=50, verbose=1)])
If I just use one layer, like:
z0 = Input(shape=[None, len(dataset[0])])
z = LSTM(4, activation='soft sign', recurrent_activation='softsign')(z0)
z = Dense(1)(z)
model = Model(inputs=z0, outputs=z)
print(model.summary())
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history= model.fit(trainX, trainY,validation_split=0.1, epochs=200, batch_size=32,
callbacks=[ReduceLROnPlateau(factor=0.67, patience=3, verbose=1, min_lr=1E-5),
EarlyStopping(patience=200, verbose=1)])
The predictions are somewhat reasonable, at least they are not constant anymore.
Why does that happen? Around 2000 samples not that many, but in the case of overfitting, I would expect the predictions to match perfectly...
- EDIT: Solved, as stated in the comments, it's just that Keras always expects Batches: Keras
When I use:
`test=model.predict(trainX[0])`
to get the prediction for the first sequence, I get an dimension error:
"Error when checking : expected input_1 to have 3 dimensions, but got array with shape (3, 3)"
I need to feed in an array of sequences like:
`test=model.predict(trainX[0:1])`
This is a workaround, but I am not really sure, whether this has any deeper meaning, or is just a syntax thing...