
While implementing a convolutional neural network (CNN) for image classification, I get good accuracy on my training set (98%), but my test set accuracy stays low (around 82%), so I am facing an overfitting issue. I tried to address it by adding dropout (keeping 50% of neurons), but this makes my training accuracy drop to 80%! Does that mean I should have more neurons in my CNN? What else could I do to prevent this issue?

I am using the TensorFlow library, and here is the code for the training loop (I am not sure I implemented it correctly):

for epoch in range(22000):
    # Sample a random mini-batch of 200 examples from the 19600 training images
    permutation = np.random.permutation(19600)[0:200]
    batch = [train_set[permutation], train_label[permutation]]
    # Training step with dropout: keep each neuron with probability 0.5
    summary, _, cost = sess.run([merged, train_step, cross_entropy],
                                feed_dict={X: batch[0], Yreal: batch[1], drop_value: 0.5})
    train_writer.add_summary(summary, epoch)
    print(cost)
    if epoch % 500 == 0:
        print(epoch)
        # Evaluate test-set accuracy with dropout disabled (keep probability 1.0)
        summary = sess.run(merged,
                           feed_dict={X: test_set[0:300], Yreal: test_label[0:300], drop_value: 1.0})
        test_writer.add_summary(summary, epoch)
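For context, my understanding is that the drop_value placeholder controls the usual inverted-dropout behaviour, which in plain NumPy would look something like this (a sketch of the mechanism only, not my actual layer code; the function name and shapes are illustrative):

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    # Inverted dropout: zero each unit with probability (1 - keep_prob)
    # and scale the survivors by 1/keep_prob so the expected activation
    # is unchanged. At test time, keep_prob=1.0 makes this a no-op.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((4, 8))
train_out = dropout(x, 0.5, rng)  # surviving units are scaled up to 2.0
test_out = dropout(x, 1.0, rng)   # identical to x
```

This is why feeding drop_value of 1 during evaluation, as in the loop above, should be the right thing to do.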
Isn't dropout kind of supposed to make your training accuracy decrease? After all, dropout means that you're training using a smaller, less powerful network. What makes you say that there's an issue? (Tanner Swett)
Actually you are right; I wasn't thinking about it this way. But if I want to achieve good test accuracy, I have to keep my training accuracy high, right? Is there any way to balance good training accuracy and nice generalization? My first idea is to train an overly powerful network... I'm not sure it will work. I'll try it and report back here. (Nico P)

1 Answer


If, with dropout in place, your test accuracy ends up around the same mark as your training accuracy, then you can increase the number of neurons. Also, dropping 50% of units as a starting point is pretty high; start with a dropout rate of 0.25 (i.e. a keep probability of 0.75 for the dropout layer). I would also recommend some data augmentation, such as rotation, distorting brightness, or swapping color channels, depending on the nature of the data, as well as some regularization. Finally, plot learning curves and check how test accuracy changes with training accuracy.
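As a rough illustration, an augmentation step along these lines could look as follows in plain NumPy (a sketch only; the probabilities and ranges are arbitrary choices, and in practice tf.image or a dedicated augmentation library is a better fit):

```python
import numpy as np

def augment(image, rng):
    """Cheap augmentations on one HxWxC image with values in [0, 1]."""
    if rng.random() < 0.5:                      # random horizontal flip
        image = image[:, ::-1, :]
    k = rng.integers(0, 4)                      # random 90-degree rotation
    image = np.rot90(image, k)
    factor = rng.uniform(0.8, 1.2)              # distort brightness
    image = np.clip(image * factor, 0.0, 1.0)
    if rng.random() < 0.5:                      # swap color channels
        image = image[:, :, rng.permutation(image.shape[-1])]
    return image

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
aug = augment(img, rng)
```

Applying a fresh random transform to each mini-batch effectively enlarges the training set, which usually narrows the gap between training and test accuracy.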