1
votes

I am training a LSTM model with a trained Word2Vec and after 3 epochs i started to observe that my training loss starts to increase while validation loss still keeps decreasing. And it is the same case for the accuracy. Training accuracy starts to decrease and validation accuracy keeps increasing. Here are the figures for the comparison and also my model parameters.

My learning rate is default set which is 0.001 and i cannot decide if i should keep training or cut the training when training loss starts to increase.

Thanks in advance.

enter image description here

enter image description here

model = Sequential()
#model.add(Embedding(maximum_words_number, e_dim, input_length=X.shape[1]))
model.add(Embedding(58137, 100, weights = [embeddings] ,input_length=X_train.shape[1],trainable = False)) # -> This adds Word2Vec encodings
model.add(LSTM(10,return_sequences= True, dropout=0.2, recurrent_dropout=0.2))
model.add(LSTM(10,return_sequences= False, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
#opt = SGD(lr=0.05)
model.compile(loss='binary_crossentropy', optimizer="Nadam", metrics=['accuracy'])
epochs = 4
batch_size = 100
model_outcome = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size,validation_split=0.2,callbacks=[EarlyStopping(monitor='val_loss', patience=1, min_delta=0.0001)])
Train on 3931 samples, validate on 983 samples
Epoch 1/4
3931/3931 [==============================] - ETA: 2:56:26 - loss: 0.6879 - accuracy: 0.580 - ETA: 2:46:13 - loss: 0.6891 - accuracy: 0.530 - ETA: 2:34:51 - loss: 0.6845 - accuracy: 0.556 - ETA: 2:26:49 - loss: 0.6820 - accuracy: 0.570 - ETA: 2:21:09 - loss: 0.6846 - accuracy: 0.550 - ETA: 2:15:42 - loss: 0.6823 - accuracy: 0.573 - ETA: 2:10:58 - loss: 0.6822 - accuracy: 0.565 - ETA: 2:06:02 - loss: 0.6854 - accuracy: 0.547 - ETA: 2:01:00 - loss: 0.6850 - accuracy: 0.558 - ETA: 1:56:26 - loss: 0.6833 - accuracy: 0.563 - ETA: 1:53:31 - loss: 0.6820 - accuracy: 0.573 - ETA: 1:50:38 - loss: 0.6817 - accuracy: 0.574 - ETA: 1:47:40 - loss: 0.6815 - accuracy: 0.573 - ETA: 1:44:31 - loss: 0.6802 - accuracy: 0.582 - ETA: 1:41:28 - loss: 0.6782 - accuracy: 0.585 - ETA: 1:38:09 - loss: 0.6779 - accuracy: 0.581 - ETA: 1:34:40 - loss: 0.6769 - accuracy: 0.580 - ETA: 1:30:53 - loss: 0.6768 - accuracy: 0.580 - ETA: 1:26:56 - loss: 0.6754 - accuracy: 0.584 - ETA: 1:22:56 - loss: 0.6739 - accuracy: 0.587 - ETA: 1:18:52 - loss: 0.6723 - accuracy: 0.590 - ETA: 1:14:51 - loss: 0.6703 - accuracy: 0.592 - ETA: 1:10:43 - loss: 0.6680 - accuracy: 0.597 - ETA: 1:06:38 - loss: 0.6648 - accuracy: 0.606 - ETA: 1:02:26 - loss: 0.6616 - accuracy: 0.611 - ETA: 58:10 - loss: 0.6594 - accuracy: 0.6142  - ETA: 53:56 - loss: 0.6580 - accuracy: 0.615 - ETA: 49:37 - loss: 0.6572 - accuracy: 0.616 - ETA: 45:18 - loss: 0.6553 - accuracy: 0.618 - ETA: 40:57 - loss: 0.6545 - accuracy: 0.619 - ETA: 36:36 - loss: 0.6527 - accuracy: 0.622 - ETA: 32:15 - loss: 0.6493 - accuracy: 0.626 - ETA: 27:52 - loss: 0.6478 - accuracy: 0.628 - ETA: 23:29 - loss: 0.6455 - accuracy: 0.630 - ETA: 19:06 - loss: 0.6424 - accuracy: 0.634 - ETA: 14:41 - loss: 0.6396 - accuracy: 0.637 - ETA: 10:15 - loss: 0.6378 - accuracy: 0.640 - ETA: 5:49 - loss: 0.6354 - accuracy: 0.643 - ETA: 1:22 - loss: 0.6335 - accuracy: 0.64 - 10937s 3s/step - loss: 0.6331 - accuracy: 0.6459 - val_loss: 0.5066 - val_accuracy: 0.7792
Epoch 2/4
3931/3931 [==============================] - ETA: 3:03:31 - loss: 0.6418 - accuracy: 0.660 - ETA: 2:57:39 - loss: 0.5754 - accuracy: 0.710 - ETA: 2:50:26 - loss: 0.5706 - accuracy: 0.716 - ETA: 2:55:55 - loss: 0.5607 - accuracy: 0.720 - ETA: 2:55:39 - loss: 0.5552 - accuracy: 0.718 - ETA: 2:55:12 - loss: 0.5473 - accuracy: 0.731 - ETA: 2:52:50 - loss: 0.5440 - accuracy: 0.737 - ETA: 2:49:19 - loss: 0.5391 - accuracy: 0.740 - ETA: 2:45:24 - loss: 0.5380 - accuracy: 0.740 - ETA: 2:41:00 - loss: 0.5361 - accuracy: 0.740 - ETA: 2:36:48 - loss: 0.5414 - accuracy: 0.734 - ETA: 2:32:57 - loss: 0.5357 - accuracy: 0.738 - ETA: 2:28:34 - loss: 0.5292 - accuracy: 0.743 - ETA: 2:24:22 - loss: 0.5240 - accuracy: 0.747 - ETA: 2:19:52 - loss: 0.5230 - accuracy: 0.750 - ETA: 2:14:57 - loss: 0.5157 - accuracy: 0.757 - ETA: 2:09:42 - loss: 0.5118 - accuracy: 0.761 - ETA: 2:04:24 - loss: 0.5154 - accuracy: 0.758 - ETA: 1:59:06 - loss: 0.5126 - accuracy: 0.760 - ETA: 1:53:46 - loss: 0.5107 - accuracy: 0.760 - ETA: 1:48:16 - loss: 0.5062 - accuracy: 0.763 - ETA: 1:42:45 - loss: 0.5032 - accuracy: 0.766 - ETA: 1:37:09 - loss: 0.5041 - accuracy: 0.767 - ETA: 1:31:22 - loss: 0.5045 - accuracy: 0.766 - ETA: 1:25:30 - loss: 0.5072 - accuracy: 0.764 - ETA: 1:19:45 - loss: 0.5071 - accuracy: 0.764 - ETA: 1:13:57 - loss: 0.5094 - accuracy: 0.763 - ETA: 1:08:07 - loss: 0.5124 - accuracy: 0.763 - ETA: 1:02:15 - loss: 0.5103 - accuracy: 0.764 - ETA: 56:19 - loss: 0.5101 - accuracy: 0.7630  - ETA: 50:20 - loss: 0.5058 - accuracy: 0.766 - ETA: 44:19 - loss: 0.5052 - accuracy: 0.767 - ETA: 38:19 - loss: 0.5063 - accuracy: 0.766 - ETA: 32:18 - loss: 0.5037 - accuracy: 0.768 - ETA: 26:15 - loss: 0.5041 - accuracy: 0.768 - ETA: 20:11 - loss: 0.5054 - accuracy: 0.766 - ETA: 14:06 - loss: 0.5068 - accuracy: 0.765 - ETA: 8:00 - loss: 0.5024 - accuracy: 0.769 - ETA: 1:53 - loss: 0.5026 - accuracy: 0.76 - 14951s 4s/step - loss: 0.5024 - accuracy: 0.7698 - val_loss: 0.4381 - val_accuracy: 0.8006
Epoch 3/4
3931/3931 [==============================] - ETA: 4:10:44 - loss: 0.5040 - accuracy: 0.750 - ETA: 3:44:47 - loss: 0.4679 - accuracy: 0.780 - ETA: 3:34:11 - loss: 0.4734 - accuracy: 0.780 - ETA: 3:26:02 - loss: 0.4729 - accuracy: 0.785 - ETA: 3:16:47 - loss: 0.4638 - accuracy: 0.784 - ETA: 3:07:57 - loss: 0.4527 - accuracy: 0.796 - ETA: 3:01:40 - loss: 0.4502 - accuracy: 0.800 - ETA: 2:56:22 - loss: 0.4458 - accuracy: 0.803 - ETA: 2:50:30 - loss: 0.4472 - accuracy: 0.801 - ETA: 2:43:48 - loss: 0.4488 - accuracy: 0.797 - ETA: 2:37:21 - loss: 0.4466 - accuracy: 0.802 - ETA: 2:31:07 - loss: 0.4468 - accuracy: 0.803 - ETA: 2:24:57 - loss: 0.4453 - accuracy: 0.806 - ETA: 2:20:04 - loss: 0.4439 - accuracy: 0.810 - ETA: 2:14:58 - loss: 0.4447 - accuracy: 0.811 - ETA: 2:09:36 - loss: 0.4401 - accuracy: 0.814 - ETA: 2:03:28 - loss: 0.4381 - accuracy: 0.816 - ETA: 1:57:37 - loss: 0.4413 - accuracy: 0.813 - ETA: 1:51:48 - loss: 0.4410 - accuracy: 0.814 - ETA: 1:45:59 - loss: 0.4432 - accuracy: 0.812 - ETA: 1:40:19 - loss: 0.4404 - accuracy: 0.814 - ETA: 1:34:33 - loss: 0.4363 - accuracy: 0.817 - ETA: 1:28:51 - loss: 0.4360 - accuracy: 0.817 - ETA: 1:23:12 - loss: 0.4363 - accuracy: 0.816 - ETA: 1:17:37 - loss: 0.4371 - accuracy: 0.816 - ETA: 1:12:05 - loss: 0.4403 - accuracy: 0.817 - ETA: 1:06:31 - loss: 0.4411 - accuracy: 0.816 - ETA: 1:01:01 - loss: 0.4389 - accuracy: 0.817 - ETA: 55:32 - loss: 0.4387 - accuracy: 0.8176  - ETA: 50:05 - loss: 0.4385 - accuracy: 0.817 - ETA: 44:38 - loss: 0.4381 - accuracy: 0.818 - ETA: 39:13 - loss: 0.4329 - accuracy: 0.821 - ETA: 33:48 - loss: 0.4352 - accuracy: 0.819 - ETA: 28:25 - loss: 0.4331 - accuracy: 0.821 - ETA: 23:02 - loss: 0.4344 - accuracy: 0.820 - ETA: 17:40 - loss: 0.4377 - accuracy: 0.818 - ETA: 12:19 - loss: 0.4355 - accuracy: 0.820 - ETA: 6:58 - loss: 0.4353 - accuracy: 0.820 - ETA: 1:39 - loss: 0.4378 - accuracy: 0.82 - 12997s 3s/step - loss: 0.4374 - accuracy: 0.8204 - val_loss: 0.4065 - val_accuracy: 0.8769
Epoch 4/4
3931/3931 [==============================] - ETA: 3:19:12 - loss: 0.4999 - accuracy: 0.810 - ETA: 3:13:36 - loss: 0.4518 - accuracy: 0.825 - ETA: 3:08:18 - loss: 0.4464 - accuracy: 0.826 - ETA: 3:03:24 - loss: 0.4385 - accuracy: 0.825 - ETA: 2:58:52 - loss: 0.4385 - accuracy: 0.826 - ETA: 2:53:35 - loss: 0.4339 - accuracy: 0.825 - ETA: 2:48:13 - loss: 0.4662 - accuracy: 0.811 - ETA: 2:43:02 - loss: 0.4660 - accuracy: 0.811 - ETA: 2:37:49 - loss: 0.4609 - accuracy: 0.815 - ETA: 2:32:42 - loss: 0.4638 - accuracy: 0.816 - ETA: 2:27:37 - loss: 0.4694 - accuracy: 0.813 - ETA: 2:22:25 - loss: 0.4592 - accuracy: 0.818 - ETA: 2:17:16 - loss: 0.4590 - accuracy: 0.819 - ETA: 2:12:02 - loss: 0.4574 - accuracy: 0.820 - ETA: 2:06:47 - loss: 0.4532 - accuracy: 0.822 - ETA: 2:01:35 - loss: 0.4654 - accuracy: 0.816 - ETA: 1:56:20 - loss: 0.4732 - accuracy: 0.812 - ETA: 1:51:06 - loss: 0.4764 - accuracy: 0.811 - ETA: 1:45:54 - loss: 0.4862 - accuracy: 0.805 - ETA: 1:40:41 - loss: 0.4912 - accuracy: 0.803 - ETA: 1:35:29 - loss: 0.4930 - accuracy: 0.801 - ETA: 1:30:17 - loss: 0.4986 - accuracy: 0.797 - ETA: 1:25:03 - loss: 0.5044 - accuracy: 0.793 - ETA: 1:19:50 - loss: 0.5032 - accuracy: 0.792 - ETA: 1:14:37 - loss: 0.4999 - accuracy: 0.794 - ETA: 1:09:24 - loss: 0.4958 - accuracy: 0.796 - ETA: 1:04:11 - loss: 0.4954 - accuracy: 0.795 - ETA: 58:59 - loss: 0.4943 - accuracy: 0.7971  - ETA: 53:45 - loss: 0.4943 - accuracy: 0.796 - ETA: 48:33 - loss: 0.4902 - accuracy: 0.799 - ETA: 43:20 - loss: 0.4883 - accuracy: 0.799 - ETA: 38:07 - loss: 0.4882 - accuracy: 0.799 - ETA: 32:55 - loss: 0.4874 - accuracy: 0.800 - ETA: 27:42 - loss: 0.4839 - accuracy: 0.802 - ETA: 22:29 - loss: 0.4809 - accuracy: 0.804 - ETA: 17:16 - loss: 0.4825 - accuracy: 0.803 - ETA: 12:03 - loss: 0.4821 - accuracy: 0.803 - ETA: 6:50 - loss: 0.4810 - accuracy: 0.804 - ETA: 1:37 - loss: 0.4816 - accuracy: 0.80 - 12786s 3s/step - loss: 0.4823 - accuracy: 0.8031 - val_loss: 0.3392 - val_accuracy: 0.8911
1
Could you try with a batch_size of 32? - yudhiesh
Sure, could you explain your reasoning? - Fatih Enes
In practice when using a larger batch there is a significant degradation in the quality of the model, as measured by its ability to generalize. But usually we try with a batch_size of 32 as a starting point. No real reasoning behind that but it just works well. - yudhiesh
I see, thank you for your explanation. I will get try it and see if my results change. - Fatih Enes

1 Answers

0
votes

I think your model is too small and is overfitted on the third epoch.