I trained a model for 50 epochs, splitting the dataset in the following proportions:
- X_train, Y_train = 70%
- X_validation, Y_validation = 20%
- X_test, Y_test = 10%
All the splits are done with scikit-learn's train_test_split function, which shuffles by default (shuffle=True):
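import numpy as np
from sklearn.model_selection import train_test_split

X = np.load(....)
Y = np.load(....)
# Split off the validation set (20% of the full dataset)
N_validation = int(len(X) * 0.2)
X_train, X_validation, Y_train, Y_validation = train_test_split(X, Y, test_size=N_validation)
# Split the remaining data once more to get the test set.
# N_test is computed from len(X) so that it is 10% of the full dataset, not 10% of the remaining 80%.
N_test = int(len(X) * 0.1)
X_train, X_test, Y_train, Y_test = train_test_split(X_train, Y_train, test_size=N_test)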
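With chained splits, the base used for each fraction matters: 10% of the data left after removing a 20% validation set is only 8% of the full dataset. The following is a small sketch of that arithmetic in plain Python; the helper `chained_split_sizes` and the dataset size of 10,000 are hypothetical and only serve as an illustration. It mirrors the fact that an integer `test_size` in train_test_split is taken as an absolute number of samples.

```python
def chained_split_sizes(n_samples, val_frac=0.2, test_frac=0.1, test_base="full"):
    """Return (n_train, n_validation, n_test) for a two-stage split.

    test_base="full":      the test size is a fraction of the full dataset.
    test_base="remaining": the test size is a fraction of what is left
                           after the validation set has been removed.
    """
    n_validation = int(n_samples * val_frac)
    n_remaining = n_samples - n_validation
    base = n_samples if test_base == "full" else n_remaining
    n_test = int(base * test_frac)
    n_train = n_remaining - n_test
    return n_train, n_validation, n_test


n = 10_000  # hypothetical dataset size
for base in ("full", "remaining"):
    sizes = chained_split_sizes(n, test_base=base)
    # Print each subset as a share of the full dataset
    print(base, [f"{s / n:.0%}" for s in sizes])
```

Under these assumptions, the "full" variant gives a 70/20/10 split, while the "remaining" variant gives 72/20/8.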
Here is the history plot.
As the history shows, the validation accuracy and loss are very close to the training accuracy and loss, and sometimes the validation loss is even lower than the training loss. I read here that this can be caused by a high dropout value, which could apply to my model since it has a dropout layer with rate=0.3. What I don't understand is whether this is a problem or not.
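The dropout explanation can be illustrated with a toy example. Keras applies dropout while computing the training loss but disables it when computing the validation loss, so the two numbers are not measured under the same conditions. The sketch below is plain Python, not the model from this question: the functions `dropout` and `mean_squared_error`, the 1000-unit layer, and the "perfect" weights are all assumptions made for illustration.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout: while training, zero each value with probability
    `rate` and rescale the survivors; at inference time, do nothing."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def mean_squared_error(rate, training, n_units=1000, n_trials=200, seed=0):
    """Average squared error of a fixed linear unit whose weights would
    predict the target exactly if no activations were dropped."""
    rng = random.Random(seed)
    activations = [1.0] * n_units
    weights = [1.0 / n_units] * n_units
    target = 1.0
    total = 0.0
    for _ in range(n_trials):
        dropped = dropout(activations, rate, training, rng)
        prediction = sum(a * w for a, w in zip(dropped, weights))
        total += (prediction - target) ** 2
    return total / n_trials


train_mode_loss = mean_squared_error(rate=0.3, training=True)
eval_mode_loss = mean_squared_error(rate=0.3, training=False)
print(f"loss with dropout active:   {train_mode_loss:.6f}")
print(f"loss with dropout disabled: {eval_mode_loss:.6f}")
```

The same weights give a higher loss when dropout is active, purely because of the injected noise. With a real Keras model, a comparable check would be to call model.evaluate(X_train, Y_train) after training, which measures the training data with dropout disabled, and compare that value with the validation loss.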
On the test set, the model reaches an accuracy of 91%.