
I'm currently developing a supervised anomaly detection model using a Multi-Layer Perceptron (MLP); the goal is to classify between benign and malicious traffic. I used the CTU-13 dataset; a sample of the dataset is as follows: Sample of Dataset. The dataset has 169,032 benign and 143,828 malicious traffic samples. The code for my MLP model is as follows:

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import optimizers
from keras.callbacks import EarlyStopping

def MLP_model():
    model = Sequential()
    model.add(Dense(1024, input_dim=15, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    adam = optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)

    model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = MLP_model()

#With Callbacks
callbacks = [EarlyStopping('val_loss', patience=5)]
hist = model.fit(Xtrain, Ytrain, epochs=50, batch_size=50, validation_split=0.20, callbacks=callbacks, verbose=1)

The results that I obtained are as follows:

Accuracy: 0.923045
Precision: 0.999158
Recall: 0.833308
F1 score: 0.908728

However, from the training curve, I suspect that the model is underfitting (based on this article): The Model's Training Curve

I've tried increasing the number of neurons and layers (as suggested here), but the same problem still occurs. I'd appreciate any help with solving this problem.

Can you share the verbose output? Your problem seems to be overfitting rather than underfitting, because your validation loss is a lot higher than the training loss, so I would actually decrease the number of neurons by a factor of 4. Also, did you scale your input between 0 and 1? – Nicolas Gervais
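For reference, a minimal sketch of scaling the inputs into the [0, 1] range, as the comment suggests; Xtrain comes from the question, while Xtest is a hypothetical held-out set:

from sklearn.preprocessing import MinMaxScaler

# Fit the scaler on the training data only, then apply the same transform to the test data
scaler = MinMaxScaler(feature_range=(0, 1))
Xtrain = scaler.fit_transform(Xtrain)
Xtest = scaler.transform(Xtest)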

2 Answers


First of all, I am pretty sure your model is actually overfitting, not underfitting. Plot only the training loss and you should see it fall close to 0, but as you can see in your plot, the validation loss is still quite high compared to the training loss. This happens because your model has far too many parameters compared to the number of data points in your training dataset.

I would recommend reducing your dense layer sizes to double digits or low triple digits.
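A rough sketch of what a much smaller network could look like, assuming the same 15 input features and sigmoid output as in the question (the exact layer sizes here are only an illustration, not a tuned architecture):

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import optimizers

def smaller_MLP_model():
    # Two small hidden layers instead of six wide ones
    model = Sequential()
    model.add(Dense(64, input_dim=15, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=optimizers.Adam(lr=0.0001),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model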


Try reducing your dropout to the range 0.1 to 0.3; it should help with the erratic-looking curve. Also, train for a few more epochs; a smoother-looking curve usually appears when you train for a larger number of epochs.
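As a rough illustration of both suggestions, here is the question's model with the dropout rate lowered to 0.2 and a longer training run; the value 0.2 and the 100-epoch budget are only examples, and Xtrain, Ytrain and callbacks are taken from the question:

def MLP_model_lower_dropout():
    model = Sequential()
    model.add(Dense(1024, input_dim=15, activation='relu'))
    model.add(Dropout(0.2))   # was 0.5
    model.add(Dense(256, activation='relu'))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.2))   # was 0.5
    model.add(Dense(128, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=optimizers.Adam(lr=0.0001),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

model = MLP_model_lower_dropout()
# More epochs; the existing EarlyStopping callback still stops training
# once the validation loss stops improving
hist = model.fit(Xtrain, Ytrain, epochs=100, batch_size=50,
                 validation_split=0.20, callbacks=callbacks, verbose=1)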