7 votes

I am using TensorFlow to train a convnet on a set of 15,000 training images spanning 22 classes. I have two conv layers and one fully connected layer. I trained the network on the 15,000 images and saw convergence and high accuracy on the training set.

However, my test set shows much lower accuracy, so I am assuming the network is overfitting. To combat this I added dropout before the fully connected layer of my network.

However, adding dropout has caused the network to never converge, even after many iterations. I was wondering why this may be. I have even used a high keep probability (keep probability of 0.9, i.e. only 10% dropout) and experienced the same results.
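For reference, here is a minimal sketch of the setup described, using the TF1-style API that `keep_prob` implies. The layer sizes are made up (the question doesn't give the exact architecture) and the conv layers are elided:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Hypothetical sizes: 1024 flattened conv features and 22 classes are assumptions.
conv_out = tf.placeholder(tf.float32, [None, 1024])  # flattened conv output
keep_prob = tf.placeholder(tf.float32)               # fed at run time

# Dropout inserted just before the fully connected layer, as described.
dropped = tf.nn.dropout(conv_out, keep_prob=keep_prob)

w = tf.Variable(tf.truncated_normal([1024, 22], stddev=0.1))
b = tf.Variable(tf.zeros([22]))
logits = tf.matmul(dropped, w) + b

# Feed keep_prob=0.9 while training, but keep_prob=1.0 when evaluating;
# a common pitfall is leaving dropout active at test time.
```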

The higher the dropout, the less I would expect it to converge. Did you try lower dropout rates? – Martin Thoma
Well, he might be talking about setting keep_prob to 0.9, which will only zero out 10% of the neurons. If you are in fact zeroing out 90% of the neurons, that would be the problem. What usually helps me when a model is not converging is lowering the learning rate by a factor of 10. See if that helps. – chasep255
Thanks, I will give that a try. Yeah, my bad, I meant that my keep_prob was 0.9. – Sam K
I just ask this question because I am new to machine learning and everything I read about dropout seems to be positive. The resources just talk about how dropout will reduce overfitting, but I am curious what the negative effects of dropout could be. – Sam K
Well, it could cause underfitting if you drop out too many neurons. – chasep255

3 Answers

1 vote

Well, by making your dropout keep probability 0.9, there is a 10% chance of each neuron's connections being switched off on every iteration. So for dropout, too, there is an optimum value.

This is taken from the CS231n course:

As shown above, with dropout we are also scaling our neurons. The case above uses 0.5 dropout; if it's 0.9, the scaling will be different again.

So basically, if the keep probability is 0.9, we need to scale the activations by 0.9; otherwise they come out roughly 10% too large at test time.

Just from this you can get an idea of how dropout can affect training: at some probabilities it can saturate your nodes, which causes the non-convergence issue.
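To make the scaling concrete, here is a small NumPy sketch of inverted dropout, the variant most frameworks (including tf.nn.dropout) implement: survivors are rescaled by 1/keep_prob during training so the expected activation already matches test time. The array and sizes are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(activations, keep_prob):
    # Zero each unit with probability (1 - keep_prob), then rescale the
    # survivors by 1/keep_prob so the expected activation is unchanged.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

x = np.ones(100_000)
for keep_prob in (0.5, 0.9):
    out = inverted_dropout(x, keep_prob)
    # The mean stays ~1.0 for any keep_prob, so no test-time scaling is needed.
    print(keep_prob, out.mean())
```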

0 votes

You can add dropout to your dense layers after the convolutional layers and remove dropout from the convolutional layers. If you want many more examples, you can put some white noise (5% random pixels) on each picture and keep a P and P' variant of each picture. This can improve your results.
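A minimal NumPy sketch of that P/P' augmentation idea might look like this; the function name and noise details are illustrative assumptions, not from the answer:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_pixel_noise(image, fraction=0.05):
    # Replace a random 5% of pixels with random values to create a
    # noisy variant P' of the original picture P.
    noisy = image.copy()
    mask = rng.random(noisy.shape[:2]) < fraction
    noisy[mask] = rng.integers(
        0, 256, size=(mask.sum(),) + noisy.shape[2:], dtype=noisy.dtype
    )
    return noisy

p = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)  # stand-in image
p_prime = add_pixel_noise(p)  # train on both P and P' to enlarge the dataset
```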

0 votes

You shouldn't use 0.9 for the dropout rate; doing this, you are losing features in your training phase. From what I've seen, most dropout rates are between 0.2 and 0.5. However, using too much dropout can cause problems in the training phase, a longer time to converge, or, in some rare cases, cause the network to learn something wrong. You need to be careful with dropout: as you can see in the image below, dropout prevents features from getting to the next layer, so too many dropout layers or a very high dropout rate can kill learning.
(Image: dropout randomly zeroing units so their features never reach the next layer)
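A quick NumPy sketch (illustrative only) shows how fast a high dropout rate starves the next layer of features:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 1000

for rate in (0.2, 0.5, 0.9):
    # Each feature is dropped with probability `rate` on a given step.
    surviving = (rng.random(n_features) >= rate).sum()
    print(f"dropout rate {rate}: ~{surviving} of {n_features} features reach the next layer")
```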