
I have a binary sequence of integers {0,1} and I'd like to create an LSTM model to predict the next binary term based on the 3 previous numbers in the sequence.

So, for example, given the training sequence [0,1,1,0,1,0,0,1]: for the first 3 numbers [0,1,1] the model should output 0, for the next window [1,1,0] it should output 1, for the next window [1,0,1] it should output 0, and so on.

To do this for the example above, I created the following training input set, named vecX:

array([[[0],
        [1],
        [1]],

       [[1],
        [1],
        [0]],

       [[1],
        [0],
        [1]],

       [[0],
        [1],
        [0]],

       [[1],
        [0],
        [0]]])

and the corresponding training output set, named vecY:

array([[0],
       [1],
       [0],
       [0],
       [1]])
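
For reference, here is a minimal NumPy sketch of how such windows can be built from a raw sequence (the window length of 3 and the reshape to (samples, timesteps, features) are assumptions taken from the shapes shown above):

import numpy as np

seq = [0, 1, 1, 0, 1, 0, 0, 1]
window = 3

# Each input sample is a window of 3 consecutive values;
# the target is the value that immediately follows the window.
vecX = np.array([seq[i:i + window] for i in range(len(seq) - window)])
vecY = np.array([seq[i + window] for i in range(len(seq) - window)])

# Reshape inputs to (samples, timesteps, features), as expected by the LSTM layer.
vecX = vecX.reshape(-1, window, 1)
vecY = vecY.reshape(-1, 1)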

I also created the following Keras LSTM network for a bigger training set:

from keras.models import Sequential
from keras.layers import Dense, LSTM

LSTM_net = Sequential()
LSTM_net.add(LSTM(1, input_shape=(3, 1)))
LSTM_net.add(Dense(1, activation="softmax"))
LSTM_net.compile(optimizer="adagrad", loss="binary_crossentropy", metrics=["accuracy"])
LSTM_net.fit(vecX, vecY, batch_size=256, epochs=100, verbose=2)

When I train this model, it gets stuck at a constant accuracy for the whole training process:

1s - loss: 0.7534 - acc: 0.4992
Epoch 2/1000
0s - loss: 0.7533 - acc: 0.4992
Epoch 3/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 4/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 5/1000
0s - loss: 0.7534 - acc: 0.4992

The resulting trained model gives only constant 0 predictions for all inputs in the train or test set, and it seems it has not learned anything about the sequence at all.

I tried other activations such as softmax, sigmoid, and linear, but I didn't see any improvement in accuracy. I even tried fitting with the shuffle=False parameter, but got the same results.

What am I doing wrong?

If this question is more about the methods of machine learning than actual programming, maybe the folks at stats.stackexchange.com can help. – Sentry

Did you use ReLU or one of its variants? – Hari Krishnan

1 Answer


Softmax is meant for categorical classification: many classes, of which exactly one is correct.

Its outputs always sum to 1, so with only one output unit the result will always be 1, no matter what the input is.
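
A quick standalone NumPy illustration (not part of the model above): softmax over a single value is exactly 1, whatever that value is.

import numpy as np

def softmax(z):
    # Standard softmax: exponentiate and normalise so the outputs sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-3.0])))  # [1.]
print(softmax(np.array([42.0])))  # [1.]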

Use sigmoid instead.
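
A minimal sketch of the corrected model, keeping the question's hyperparameters and only swapping the output activation (increasing the number of LSTM units is a further, optional tweak, not something required by this fix):

from keras.models import Sequential
from keras.layers import Dense, LSTM

LSTM_net = Sequential()
LSTM_net.add(LSTM(1, input_shape=(3, 1)))      # a larger unit count may also help it fit
LSTM_net.add(Dense(1, activation="sigmoid"))   # sigmoid for a single binary output
LSTM_net.compile(optimizer="adagrad", loss="binary_crossentropy", metrics=["accuracy"])
LSTM_net.fit(vecX, vecY, batch_size=256, epochs=100, verbose=2)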