I am a biologist finding my way into deep learning, and I have read a number of books and online tutorials. In short, I am building a model in R with keras that uses 522 variables from a dataset of 6,500 records to predict a binary class. The main code for the model is as follows:
library(keras)

# Two hidden layers with L2 weight penalties and dropout
model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = ncol(x_train),
              kernel_regularizer = regularizer_l2(0.001)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = 'relu',
              kernel_regularizer = regularizer_l2(0.001)) %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 2, activation = 'sigmoid')

# compile() configures the model in place
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)

# fit() returns the training history
history <- model %>% fit(
  x_train, y_train,
  epochs = 50,
  batch_size = 150,
  validation_split = 0.20
)

# Loss and accuracy on the held-out test set
acc <- model %>% evaluate(x_test, y_test)
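One detail of the fit call above that may matter: keras takes the `validation_split` rows from the end of the training data, before any shuffling. The sketch below uses base R only and made-up labels (`y_sorted`, with hypothetical class counts) to show which rows would be held out if the data happened to be ordered by class. Whether my real data is ordered this way is an assumption to check, not something I have confirmed.

```r
# Hypothetical labels standing in for y_train: 1000 rows sorted by class.
y_sorted <- c(rep(1L, 700), rep(0L, 300))

# Indices of the last `split` fraction of rows, i.e. the validation slice.
val_rows <- function(n, split) seq(n - round(n * split) + 1, n)

idx <- val_rows(length(y_sorted), 0.20)

# Class counts in the part used for training and in the validation slice.
train_counts <- table(y_sorted[-idx])
val_counts   <- table(y_sorted[idx])
print(train_counts)
print(val_counts)

# Shuffling the rows first gives both slices a similar class mix.
set.seed(1)
perm <- sample(length(y_sorted))
y_shuffled <- y_sorted[perm]
print(table(y_shuffled[idx]))
```

With real data, the same permutation would have to be applied to the rows of `x_train` and `y_train` together before calling `fit()`.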
I have tuned the number of hidden units in each layer, the batch size, and the number of epochs, both lower and higher, but the accuracy remains unsatisfactory. Specifically, the training accuracy quickly reaches 70-90%, depending on the number of hidden units, but in every case the validation accuracy never exceeds 30%. When I applied the model to the test set, I got an accuracy of 70%, but the confusion matrix shows that the model predicts only class 1 well (sensitivity is 97%), while class 0 is poorly predicted (specificity is about 20%).
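For reference, here is a minimal sketch of how sensitivity and specificity are derived from a confusion matrix, with class 1 treated as the positive class. The `actual` and `predicted` vectors are made up for illustration and are not my real results.

```r
# Made-up labels and predictions, only to illustrate the arithmetic.
actual    <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 1, 0, 1, 1, 1, 0, 0)

# Confusion matrix with rows = actual class, columns = predicted class.
cm <- table(actual    = factor(actual,    levels = c(0, 1)),
            predicted = factor(predicted, levels = c(0, 1)))

tp <- cm["1", "1"]  # class 1 predicted as 1
fn <- cm["1", "0"]  # class 1 predicted as 0
tn <- cm["0", "0"]  # class 0 predicted as 0
fp <- cm["0", "1"]  # class 0 predicted as 1

sensitivity <- tp / (tp + fn)        # share of class 1 recovered
specificity <- tn / (tn + fp)        # share of class 0 recovered
accuracy    <- (tp + tn) / sum(cm)   # overall share correct
balanced_accuracy <- (sensitivity + specificity) / 2

print(cm)
```

Overall accuracy can look acceptable while specificity is low, which is why I am reporting both figures rather than accuracy alone.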
I also ran the same data through PLS-DA with the mixOmics package, and the results are rather good: on the test set, sensitivity, specificity, and area under the curve are all >= 70%.
So I do not need deep learning (in this case) to be better than PLS-DA, but I would hope it could come somewhere near PLS-DA.
Can you give me some advice on how to move in the right direction to improve the deep learning model? The training and test data that I am working with can be obtained here: https://drive.google.com/file/d/1XFmTosHk5hZABFgJOHgQGLiP-DnbGHLv/view?usp=sharing https://drive.google.com/file/d/10viyKknQNolgCR45mEijF5RIxKqMK23a/view?usp=sharing
Many thanks, Ho