LSTM optimization with Keras

Question

Update: Thanks, Catalina, for taking the time.

I did what you suggested. 1- I split the data into train and validation sets and added the validation set to the fit call:

history = model_final.fit(x_train_multi,
                      y_train_multi,
                      batch_size=BATCH_SIZE,
                      epochs=EPOCHS,
                      validation_data=(x_val_multi, y_val_multi),
                      callbacks=[tensorboard_callback,model_checkpoint_callback]
                      )

2- I also created the checkpoint and added it to the callbacks:

checkpoint_filepath = 'checkpoint_model/'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='mse',
    mode='min',
    save_best_only=True)

3- I also tried adding dropout, adding more LSTM layers, and increasing the number of units per layer:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_50 (LSTM)               (None, 128)               67072     
_________________________________________________________________
activation_19 (Activation)   (None, 128)               0         
_________________________________________________________________
dense_48 (Dense)             (None, 256)               33024     
_________________________________________________________________
dense_49 (Dense)             (None, 7)                 1799      
=================================================================
val_mse: 0.036


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_51 (LSTM)               (None, 128)               67072     
_________________________________________________________________
batch_normalization_11 (Batc (None, 128)               512       
_________________________________________________________________
activation_20 (Activation)   (None, 128)               0         
_________________________________________________________________
dense_50 (Dense)             (None, 256)               33024     
_________________________________________________________________
dense_51 (Dense)             (None, 7)                 1799      
=================================================================
val_mse: 0.91


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_54 (LSTM)               (None, 128)               67072     
_________________________________________________________________
activation_23 (Activation)   (None, 128)               0         
_________________________________________________________________
dropout_17 (Dropout)         (None, 128)               0         
_________________________________________________________________
dense_54 (Dense)             (None, 256)               33024     
_________________________________________________________________
dropout_18 (Dropout)         (None, 256)               0         
_________________________________________________________________
dense_55 (Dense)             (None, 7)                 1799      
=================================================================
val_mse: 0.0281


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_55 (LSTM)               (None, 30, 128)           67072     
_________________________________________________________________
activation_24 (Activation)   (None, 30, 128)           0         
_________________________________________________________________
lstm_56 (LSTM)               (None, 128)               131584    
_________________________________________________________________
activation_25 (Activation)   (None, 128)               0         
_________________________________________________________________
dense_56 (Dense)             (None, 256)               33024     
_________________________________________________________________
dense_57 (Dense)             (None, 7)                 1799      
=================================================================
val_mse: 0.0822


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_57 (LSTM)               (None, 30, 128)           67072     
_________________________________________________________________
activation_26 (Activation)   (None, 30, 128)           0         
_________________________________________________________________
lstm_58 (LSTM)               (None, 128)               131584    
_________________________________________________________________
activation_27 (Activation)   (None, 128)               0         
_________________________________________________________________
dense_58 (Dense)             (None, 256)               33024     
_________________________________________________________________
dense_59 (Dense)             (None, 256)               65792     
_________________________________________________________________
dense_60 (Dense)             (None, 7)                 1799      
=================================================================
val_mse: 0.0541

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_61 (LSTM)               (None, 30, 128)           67072     
_________________________________________________________________
activation_30 (Activation)   (None, 30, 128)           0         
_________________________________________________________________
lstm_62 (LSTM)               (None, 128)               131584    
_________________________________________________________________
activation_31 (Activation)   (None, 128)               0         
_________________________________________________________________
dense_62 (Dense)             (None, 7)                 903       
=================================================================
val_mse: 0.0713

4- I also tried adding BatchNormalization.

Nothing seems to help me get an accurate prediction of the next 5 days.

Is there anything else you think I can try?

Thanks!!

I started playing with Keras in Python a few weeks ago, and now I am trying to solve a prediction problem. I have the number of active students and the number of students who attend classes each day. I want to predict attendance for the following X days, where X is 10 for now. I have data from January 1 to September 28. I group the number of active students and the number of attending students into windows of 30 days, feed those to the RNN, and get an array of 10 values as the output.
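The windowing described above could look roughly like the sketch below. It is not the notebook's actual code: the function name `make_windows`, the tuple layout, and the synthetic data are assumptions made for illustration, and it uses plain Python lists so it runs without any library.

```python
def make_windows(series, history=30, horizon=10):
    """Slice a daily series into (input window, target) pairs.

    series: list of (active_students, attending_students) tuples, one per day.
    Returns inputs shaped [samples][history][2] and targets shaped
    [samples][horizon], where each target holds future attendance only.
    """
    inputs, targets = [], []
    last_start = len(series) - history - horizon
    for start in range(last_start + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        inputs.append([list(day) for day in past])
        targets.append([attending for _, attending in future])
    return inputs, targets


# Hypothetical data: 272 daily rows with a made-up weekly pattern.
days = [(100, 60 + d % 7) for d in range(272)]
x, y = make_windows(days)
print(len(x), len(x[0]), len(x[0][0]), len(y[0]))
```

Each sample is a 30 x 2 window paired with the next 10 attendance values. Because consecutive windows overlap by 29 days, the number of samples is only the number of days minus 39, so a few hundred daily rows give a few hundred highly correlated samples.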

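This is my model:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_2 (LSTM)                (None, 256)               265216    
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
=================================================================
Total params: 299,402
Trainable params: 299,402
Non-trainable params: 0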

I am trying to improve my model to get predictions that are as precise as possible, but I am having a hard time with that.

I would appreciate your comments and recommendations.

Here is my google colab notebook: https://colab.research.google.com/drive/15RhjFwzjpUAhdXFWkn1dmRgFLoDQcuiU?usp=sharing

And dataset: https://drive.google.com/file/d/1-HhaXSs4Bf0a0626bhAUCobFZRTasOGM/view?usp=sharing

Thanks in advance

After a quick look at your data, it doesn't seem to be sequential. The main function of an RNN is to figure out the dependency of a node on its neighbors in a temporal sequence. Does the number of students attending today depend on yesterday's number? Or does it depend more on features like the day of the week (whether it's Sunday or Monday) or the holiday season? This looks like a regression problem to me. Have you thought about using a regression model on your data rather than an RNN? - mb0850
I checked your data, and it seems to me that you do not have enough samples for an LSTM. @mb0850 is right, you should try regression. I recommend gradient boosting algorithms; they give good results. - Catalina Chircu
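
One quick way to check whether the day of the week explains most of the variation, as the comments suggest, is a per-weekday mean baseline. The sketch below is an illustration under assumptions, not either commenter's method: the helper names `weekday_means` and `forecast` and the synthetic attendance values are hypothetical. It is equivalent to a least-squares regression on one-hot weekday features alone, and it is not gradient boosting.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekday_means(dates, values):
    """Average the observed value for each weekday (0 = Monday)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for day, value in zip(dates, values):
        totals[day.weekday()] += value
        counts[day.weekday()] += 1
    return {wd: totals[wd] / counts[wd] for wd in totals}


def forecast(means, last_date, horizon):
    """Predict each of the next `horizon` days from its weekday mean."""
    # Weekdays never seen in the history fall back to the overall mean.
    fallback = sum(means.values()) / len(means)
    future = [last_date + timedelta(days=i) for i in range(1, horizon + 1)]
    return [means.get(day.weekday(), fallback) for day in future]


# Hypothetical attendance: 50 on weekdays, 5 on weekends, for 8 weeks.
start = date(2020, 1, 1)
dates = [start + timedelta(days=i) for i in range(56)]
attendance = [5 if d.weekday() >= 5 else 50 for d in dates]

means = weekday_means(dates, attendance)
prediction = forecast(means, dates[-1], horizon=10)
```

If a baseline like this already comes close to the LSTM's validation error, that would support treating the task as regression on calendar features rather than as a sequence problem.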

Catalina Chircu · Accepted Answer · 2020-09-30T22:16:16

Here are some hints:

Split your training set into train/validation and add a validation set to your model (you add it as a tuple argument to the fit function).
Add a checkpoint callback (pass it in the callbacks argument of the fit function), which will save the best model, and use that model afterwards when making the prediction. Check the TensorFlow/Keras documentation for more information.
Test different values for Dropout and the hidden sizes.
Add a BatchNormalization layer.

Try all these and see which provides the best result.
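
A minimal sketch of how these hints might fit together is shown below. The random data, layer sizes, dropout rate, and the file name `best.weights.h5` are all assumptions chosen for illustration. Holding out the last windows for validation is also an assumption that suits time-ordered data, not something stated above. The checkpoint monitors `val_loss`, since "best" is usually meant in terms of a validation metric rather than a training metric.

```python
import numpy as np
import tensorflow as tf

# Hypothetical data: 200 windows of 30 days x 2 features, 10-day targets.
rng = np.random.default_rng(0)
x = rng.random((200, 30, 2)).astype("float32")
y = rng.random((200, 10)).astype("float32")

# Hint 1: hold out the last 20% of windows as a validation set.
split = int(len(x) * 0.8)
x_train, y_train = x[:split], y[:split]
x_val, y_val = x[split:], y[split:]

# Hints 3 and 4: Dropout and BatchNormalization after a small LSTM.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(30, 2)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10),
])
model.compile(optimizer="adam", loss="mse")

# Hint 2: keep only the weights with the lowest validation loss.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="best.weights.h5",
    save_weights_only=True,
    monitor="val_loss",
    mode="min",
    save_best_only=True,
)

history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=3,
    batch_size=32,
    callbacks=[checkpoint],
    verbose=0,
)

# Restore the best weights before predicting.
model.load_weights("best.weights.h5")
predictions = model.predict(x_val, verbose=0)
```

The dropout rate and hidden size are the values to vary when comparing runs; comparing the `val_loss` entries in `history.history` across configurations shows which one generalizes best.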
