Fine-tuning CNN configuration

Question

I'm currently trying to create a CNN but am having issues figuring out what parameters to be using. Choosing a filter size, number of filters, and number of convolutional layers just isn't clicking with me. All the resources I have found or been given just say to 'look at your dataset' and determine them, which just isn't helpful.

I have a dataset with around 40000 instances and 500 classes. I'm using Keras on top of Tensorflow-gpu. Currently I'm using 6 1D convolutional layers with tanh activation, and relu and softmax for my 2 dense layers. I do max pooling after every 2 conv layers. I have two dropouts (between my dense layers and after), set to 0.4 each. With my current filter sizes and numbers, I'm getting around 35% accuracy but am expecting close to 80% based on current research.

I'm not necessarily looking for someone to tell me exactly what numbers or configurations I should be plugging into my model, just some guidance on how to even begin determining such values. I've really just guessed and checked thus far. Additionally, I'm unsure how the values should relate to each other: should I be doubling the number of filters each layer? How often should I repeat a filter size? etc.

I've looked at other questions on Stack Overflow, and all the ones with answers seem to tell people to try simpler models or to base their work off of Keras examples, which is not something that would work for my case.

Is your model overfitting, or is the train accuracy also low? As a general strategy, I like to find an architecture capable of overfitting and then cut down some layers and add dropout/batch norm. I also use relu as the activation function in convolutional layers instead of tanh. Playing with the optimizer and learning rate could also help. Btw, is it possible to be a little more specific about the nature of your problem? Good luck! — Artur Lacerda
I'm not seeing overfitting, the train accuracy is low. Thanks for your suggestions. — user4096758

parsethis · Accepted Answer · 2018-03-30T02:55:10

More data.

My first inclination here is that you don't have enough training examples. You have 500 classes and 40000 examples, which is 80 examples per class on average. That is of course assuming you have balanced classes, which I would bet isn't the case. That is just not going to be enough.
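
The balance assumption is easy to check by counting examples per class. A minimal sketch, assuming the labels are available as a plain Python list of class ids; `class_balance` and the `labels` stand-in below are hypothetical and not taken from the asker's data:

```python
from collections import Counter


def class_balance(labels):
    """Return (mean, min, max) number of examples per class."""
    counts = Counter(labels)            # class id -> number of examples
    per_class = list(counts.values())
    mean = len(labels) / len(counts)    # average examples per class
    return mean, min(per_class), max(per_class)


# Hypothetical stand-in data: 40000 labels spread evenly over 500 classes.
# Replace this with the real label column of the dataset.
labels = [i % 500 for i in range(40000)]

mean, smallest, largest = class_balance(labels)
print(mean, smallest, largest)  # 80.0 80 80 for the evenly spread stand-in
```

On real data, a minimum far below the mean would suggest that some classes have too few examples to learn from, even if the average looks acceptable.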
