I have a binary sequence of integers in {0,1} and I'd like to create an LSTM model that predicts the next binary term from the 3 previous numbers in the sequence.
For example, given the training sequence [0,1,1,0,1,0,0,1]: for the first 3 numbers [0,1,1] the model should output 0, for the next window [1,1,0] it should output 1, for the next window [1,0,1] it should output 0, and so on.
To do so, for the example above I created the following training input set, named vecX:
array([[[0],
        [1],
        [1]],

       [[1],
        [1],
        [0]],

       [[1],
        [0],
        [1]],

       [[0],
        [1],
        [0]],

       [[1],
        [0],
        [0]]])
and the corresponding training output set, named vecY:
array([[0],
       [1],
       [0],
       [0],
       [1]])
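For reference, this is roughly how I slice the sequence into windows and targets with NumPy (the helper name make_windows is just illustrative, not from my actual script):

```python
import numpy as np

def make_windows(seq, window=3):
    """Slide a length-`window` window over the sequence; the
    element right after each window is the target."""
    seq = np.asarray(seq)
    X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
    y = seq[window:]
    # Keras LSTMs expect input shaped (samples, timesteps, features)
    return X.reshape(-1, window, 1), y.reshape(-1, 1)

vecX, vecY = make_windows([0, 1, 1, 0, 1, 0, 0, 1])
print(vecX.shape)    # (5, 3, 1)
print(vecY.ravel())  # [0 1 0 0 1]
```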
I also created the following Keras LSTM network, which I train on a bigger training set:
from keras.models import Sequential
from keras.layers import Dense, LSTM

LSTM_net = Sequential()
LSTM_net.add(LSTM(1, input_shape=(3, 1)))
LSTM_net.add(Dense(1, activation="softmax"))
LSTM_net.compile(optimizer="adagrad", loss="binary_crossentropy", metrics=["accuracy"])
LSTM_net.fit(vecX, vecY, batch_size=256, epochs=100, verbose=2)
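The bigger training set is built the same way, just from a longer sequence. As a self-contained stand-in (the random sequence below is only illustrative, not my real data):

```python
import numpy as np

window = 3
rng = np.random.default_rng(0)
seq = rng.integers(0, 2, size=10000)  # illustrative 0/1 sequence

# Same windowing as in the small example above
vecX = np.array([seq[i:i + window] for i in range(len(seq) - window)]).reshape(-1, window, 1)
vecY = seq[window:].reshape(-1, 1)
print(vecX.shape, vecY.shape)  # (9997, 3, 1) (9997, 1)
```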
When I train this model, the accuracy stays constant throughout the whole training process:
1s - loss: 0.7534 - acc: 0.4992
Epoch 2/1000
0s - loss: 0.7533 - acc: 0.4992
Epoch 3/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 4/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 5/1000
0s - loss: 0.7534 - acc: 0.4992
The resulting trained model predicts a constant 0 for every input in both the training and test sets, so it seems it hasn't learned anything about the sequence at all.
I tried other activations for the output layer (softmax, sigmoid, linear) but didn't see any improvement in accuracy. I even tried fitting with the shuffle=False parameter, but got the same results.
What am I doing wrong?