How to overcome overfitting in CNN - standard methods don't work

Question

I've recently been playing around with the Cars dataset from Stanford (http://ai.stanford.edu/~jkrause/cars/car_dataset.html). From the very beginning I had an overfitting problem, so I:

Added regularization (L2, dropout, batch norm, ...)
Tried different architectures (VGG16, VGG19, InceptionV3, DenseNet121, ...)
Tried transfer learning using models trained on ImageNet
Used data augmentation

Every step moved me a little bit forward. However, I ended up with 50% validation accuracy (starting from below 20%), compared to 99% training accuracy.

Do you have any idea what more I can do to get to around 80-90% accuracy?

Hope this can help some people!:)

50% validation accuracy means it's no better than random. Does your training set resemble the validation set? If all architectures failed, it must have something to do with this. Are the training batches shuffled? — Littleone
@Littleone He has 196 classes, so random would be around 0.5% accuracy. — Imran

Daniele Grattarola · Accepted Answer · 2018-02-04T15:18:32

Things you should try include:

Early stopping, i.e. use a portion of your data to monitor validation loss and stop training if performance does not improve for some epochs.
Check whether you have unbalanced classes; if so, use class weighting so that each class is represented equally during training.
Regularization parameter tuning: different l2 coefficients, different dropout values, different regularization constraints (e.g. l1).
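
To make the first two suggestions concrete, here is a minimal framework-free sketch. The names `balanced_class_weights` and `EarlyStopper` are hypothetical helpers written for illustration, the weighting formula is the common "balanced" heuristic (n_samples / (n_classes * class_count)), and the loss values are made-up toy numbers, not results from the Cars dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses n_samples / (n_classes * class_count), so rare classes get
    larger weights and a perfectly balanced dataset gets weight 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


class EarlyStopper:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def update(self, epoch, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy usage with made-up numbers (hypothetical, for illustration only).
weights = balanced_class_weights([0, 0, 0, 1])  # class 1 is under-represented
val_losses = [2.0, 1.5, 1.4, 1.45, 1.5, 1.6]    # validation loss per epoch
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.update(epoch, loss):
        stopped_at = epoch
        break
```

In Keras the same ideas are available out of the box: the `EarlyStopping` callback (with `monitor`, `patience` and `restore_best_weights`) and the `class_weight` argument of `Model.fit`, which accepts a dictionary like the one computed above.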

Another general suggestion is to try to replicate the state-of-the-art models on this particular dataset and see if they perform as they should.
Also make sure to have all implementation details ironed out (e.g. convolution is being performed along width and height, and not along the channels dimension - this is a classic rookie mistake when starting out with Keras, for instance).
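
One cheap way to catch the channel-axis mistake is to assert the layout of a single image before building the model. The sketch below is an assumption-laden illustration: `split_image_shape` is a hypothetical helper, the `"channels_last"` / `"channels_first"` strings follow the Keras `data_format` convention, and the check assumes ordinary images with 1, 3 or 4 channels.

```python
def split_image_shape(shape, data_format="channels_last"):
    """Return (height, width, channels) for a single-image shape.

    Raises ValueError when the axis treated as channels does not look like
    an image channel count, which usually means the data layout and the
    declared data_format disagree.
    """
    if len(shape) != 3:
        raise ValueError(f"expected a 3D image shape, got {shape}")
    if data_format == "channels_last":
        height, width, channels = shape
    elif data_format == "channels_first":
        channels, height, width = shape
    else:
        raise ValueError(f"unknown data_format: {data_format!r}")
    # Assumption: grayscale, RGB or RGBA images only.
    if channels not in (1, 3, 4):
        raise ValueError(
            f"{channels} channels for shape {shape} with {data_format}; "
            "the convolution may be sliding over the wrong axes"
        )
    return height, width, channels


# A (3, 224, 224) array is fine as channels_first ...
ok = split_image_shape((3, 224, 224), "channels_first")

# ... but declaring it channels_last would treat 224 as the channel count.
try:
    split_image_shape((3, 224, 224), "channels_last")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

Running a check like this on one batch from the data pipeline, with the same `data_format` the model's convolution layers use, is usually enough to rule this class of bug out.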

It would also help to have some more details on the code that you are using, but for now these suggestions will do.
50% accuracy on a 200-class problem doesn't sound so bad anyway.

Cheers
