
I have a binary sequence of integers {0,1} and I'd like to create an LSTM model to predict the next binary term based on the 3 previous numbers in the sequence.

So, for example, given the training sequence [0,1,1,0,1,0,0,1]: for the first 3 numbers [0,1,1] the model should output 0, for the next window [1,1,0] it should output 1, for the next window [1,0,1] it should output 0, and so on.

To do this for the example above, I created the following training input set, named vecX:

array([[[0],
        [1],
        [1]],

       [[1],
        [1],
        [0]],

       [[1],
        [0],
        [1]],

       [[0],
        [1],
        [0]],

       [[1],
        [0],
        [0]]])

and the corresponding training output set, named vecY:

array([[0],
       [1],
       [0],
       [0],
       [1]])
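
For reference, here is a minimal NumPy sketch of how such windows can be built from a raw sequence (the window length of 3 and the reshape to (samples, timesteps, features) are assumptions taken from the shapes shown above):

import numpy as np

seq = [0, 1, 1, 0, 1, 0, 0, 1]
window = 3

# Each input sample is a window of 3 consecutive values;
# the target is the value that immediately follows the window.
vecX = np.array([seq[i:i + window] for i in range(len(seq) - window)])
vecY = np.array([seq[i + window] for i in range(len(seq) - window)])

# Reshape inputs to (samples, timesteps, features), as expected by the LSTM layer.
vecX = vecX.reshape(-1, window, 1)
vecY = vecY.reshape(-1, 1)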

I also created the following Keras LSTM network for a bigger training set:

from keras.models import Sequential
from keras.layers import Dense, LSTM

LSTM_net = Sequential()
LSTM_net.add(LSTM(1, input_shape=(3, 1)))
LSTM_net.add(Dense(1, activation="softmax"))
LSTM_net.compile(optimizer="adagrad", loss="binary_crossentropy", metrics=["accuracy"])
LSTM_net.fit(vecX, vecY, batch_size=256, epochs=100, verbose=2)

When I train this model, it gets stuck at a constant accuracy for the whole training process:

1s - loss: 0.7534 - acc: 0.4992
Epoch 2/1000
0s - loss: 0.7533 - acc: 0.4992
Epoch 3/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 4/1000
0s - loss: 0.7534 - acc: 0.4992
Epoch 5/1000
0s - loss: 0.7534 - acc: 0.4992

The resulting trained model gives only constant 0 predictions for all inputs in the train or test set, and it seems it has not learned anything about the sequence at all.

I tried other activations such as softmax, sigmoid, and linear, but I didn't see any improvement in accuracy. I even tried fitting with the shuffle=False parameter, but got the same results.

What am I doing wrong?

If this question is more about the methods of machine learning than actual programming, maybe the folks at stats.stackexchange.com can help. – Sentry

Did you use ReLU or one of its variants? – Hari Krishnan

1 Answer


Softmax is meant for categorical classification: many classes, of which exactly one is correct.

Its outputs always sum to 1, so with only one output unit the result will always be 1, no matter what the input is.
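
A quick standalone NumPy illustration (not part of the model above): softmax over a single value is exactly 1, whatever that value is.

import numpy as np

def softmax(z):
    # Standard softmax: exponentiate and normalise so the outputs sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([-3.0])))  # [1.]
print(softmax(np.array([42.0])))  # [1.]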

Use sigmoid instead.
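
A minimal sketch of the corrected model, keeping the question's hyperparameters and only swapping the output activation (increasing the number of LSTM units is a further, optional tweak, not something required by this fix):

from keras.models import Sequential
from keras.layers import Dense, LSTM

LSTM_net = Sequential()
LSTM_net.add(LSTM(1, input_shape=(3, 1)))      # a larger unit count may also help it fit
LSTM_net.add(Dense(1, activation="sigmoid"))   # sigmoid for a single binary output
LSTM_net.compile(optimizer="adagrad", loss="binary_crossentropy", metrics=["accuracy"])
LSTM_net.fit(vecX, vecY, batch_size=256, epochs=100, verbose=2)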