5
votes

I am trying to train Keras LSTM model to predict next number in a sequence.

  1. What is wrong with my model below, and how do I debug a model that is not learning?
  2. How do I decide which layer types to use?
  3. On what basis should I select the loss and optimizer parameters when compiling?

My input training data is of shape (16000, 10) like below

[
    [14955 14956 14957 14958 14959 14960 14961 14962 14963 14964]
    [14731 14732 14733 14734 14735 14736 14737 14738 14739 14740]
    [35821 35822 35823 35824 35825 35826 35827 35828 35829 35830]
    [12379 12380 12381 12382 12383 12384 12385 12386 12387 12388]
    ...
]

Corresponding output training data is of shape (16000, 1) like below

[[14965] [14741] [35831] [12389] ...]

As the LSTM layer was complaining about the input shape, I reshaped the training/test data

X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
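The reshape can be sketched on dummy data (the array here is a stand-in for the real data; only the shapes are taken from the post). Keras LSTM layers expect 3-D input of shape (samples, timesteps, features):

```python
import numpy as np

# Dummy stand-in for the real windowed data (shapes taken from the post)
X_train = np.arange(16000 * 10).reshape(16000, 10)

# LSTM layers expect 3-D input: (samples, timesteps, features)
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
print(X_train.shape)  # (16000, 10, 1)
```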

Here is the final training/test data shape

Total Samples: 20000
X_train: (16000, 10, 1)
y_train: (16000, 1)
X_test: (4000, 10, 1)
y_test: (4000, 1)

Here is my model

# Model configuration
epochs = 2
batch_size = 32
hidden_neurons = 100
output_size = 1

# Create the model
model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dense(output_size))
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size)

scores = model.evaluate(X_test, y_test, batch_size=batch_size, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

Here is my output

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_3 (LSTM)                (None, 100)               40800     
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 101       
=================================================================
Total params: 40,901
Trainable params: 40,901
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/2
16000/16000 [==============================] - 11s - loss: 533418575.3600 - acc: 0.0000e+00    
Epoch 2/2
16000/16000 [==============================] - 10s - loss: 532474289.7280 - acc: 6.2500e-05    
Model Accuracy: 0.00%
did you try it with more than 2 epochs? - Wilmar van Ommeren
Yes, I tried even 10 epochs, but loss is not decreasing much, accuracy stays 0 - Mosu
This looks like a regression problem, in that case accuracy makes no sense. - Dr. Snoopy
So how do I train and evaluate a regression problem without accuracy? What should I look at? - Mosu
Hello @MatiasValdenegro, thank you for pointing out the improper metric for regression. I read more on regression metrics here: link. Now I am using mse and mae, and the results look more realistic. Thanks. - Mosu
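For reference, a minimal sketch of compiling the same architecture with a regression metric (mae) instead of accuracy, as the comments suggest; the layer sizes are taken from the question and the `tensorflow.keras` import path is an assumption:

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM, Dense

# Same architecture as in the question, but tracked with a regression
# metric (mean absolute error) rather than accuracy
model = Sequential([Input(shape=(10, 1)), LSTM(100), Dense(1)])
model.compile(loss='mean_squared_error', optimizer='rmsprop', metrics=['mae'])
```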

2 Answers

2
votes

try this code:

epochs = 30
batch_size = 64
hidden_neurons = 32
output_size = 1

# Create the model
model = Sequential()
model.add(LSTM(hidden_neurons, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dense(output_size, activation='elu'))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
print(model.summary())
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size)

scores = model.evaluate(X_test, y_test, batch_size=batch_size, verbose=0)
print("Model MAE: %.4f" % scores[1])

In general, it is really hard to help you, because we would need a reproducible example we can test. However, here is my advice:

play with the hyperparameters of your NN, such as: activation functions, the optimizer, the number of layers, the learning rate, and so on.

UPDATE:

It is highly advisable to normalize your data first.
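A sketch of min-max normalization, computing the scaling statistics from the training set only; the dummy arrays and value range stand in for the real data:

```python
import numpy as np

# Dummy stand-in for the real data (value range is an assumption)
X_train = np.random.randint(0, 50000, size=(16000, 10, 1)).astype('float32')
y_train = np.random.randint(0, 50000, size=(16000, 1)).astype('float32')

# Min-max scale with training-set statistics only
lo, hi = X_train.min(), X_train.max()
X_scaled = (X_train - lo) / (hi - lo)
y_scaled = (y_train - lo) / (hi - lo)  # same scale, so predictions invert cleanly

# After predicting: y_pred = y_pred_scaled * (hi - lo) + lo
```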

1
votes

Accuracy is not the right measure of your model's performance. What you are trying to do here is a regression task, not a classification task. The same can be seen from your loss function: you are using 'mean_squared_error' rather than something like 'categorical_crossentropy'.

Additionally, 50 epochs is too little training time. If you take a look at the logs (in your original question), you can see the loss decreasing with every epoch. You will need to keep training with more epochs until the loss stabilises and stops decreasing.
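Rather than guessing an epoch count, training can be stopped automatically once validation loss stops improving, e.g. with Keras's EarlyStopping callback (the patience value here is an assumption, not a recommendation):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop once validation loss has not improved for 10 epochs, and keep the
# weights from the best epoch seen
early_stop = EarlyStopping(monitor='val_loss', patience=10,
                           restore_best_weights=True)

# model.fit(X_train, y_train, epochs=500, batch_size=32,
#           validation_split=0.1, callbacks=[early_stop])
```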

Thirdly, you will definitely have to normalize your data before passing it to the fit function. The values are very large, and the algorithm may not converge without normalization.

If you still need this problem solved, and need more help, let me know in the comments, so that I can help with the code.