I have split the dataset ( around 28K images) into 75% trainset and 25% testset. Then I have taken randomly 15% of trainset and randomly 15% of testset to create a validation set. The goal is to classify the images into two categories. The exact image sample can't be shared. But its similar to the one attached. I'm using this model: VGG19 with imagenet weights with last two layers trainable and 4 dense layers appended. I am also using ImageDataGenerator to Augment the images. I trained the model for 30 epochs and found that the training accuracy is 95% and Validation accuracy is 96% and when trained on test dataset it fell down enormously to 75% only.
I have tried regularization and dropout to tackle the overfitting if it is suffering. I have also done one more thing to see what happens if I use the testset as Validation set and test the model on the same testset. The results were: Trainset Acc = 96% and Validation ACC = 96.3% and the testAcc = 68%. I don't understand what should I Do ?