Dropout rate in bottleneck layers

Question

It is common to use a dropout rate of 0.5 as a default, which I also use in my fully-connected network. This advice follows the recommendations from the original Dropout paper (Hinton et al.).

My network consists of fully-connected layers of size

[1000, 500, 100, 10, 100, 500, 1000, 20].

I do not apply dropout to the last layer, but I do apply it to the bottleneck layer of size 10. This does not seem reasonable with a dropout rate of 0.5; I suspect too much information gets lost. Is there a rule of thumb for how to treat bottleneck layers when using dropout? Is it better to increase the size of the bottleneck or to decrease the dropout rate?
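To make the concern concrete, here is a small sketch of the setup described above. It only does arithmetic on the layer sizes and assumes the same rate of 0.5 on every layer except the last; the variable names are mine and not tied to any framework.

```python
# Layer sizes from the question; the last layer (20) is the output layer.
layer_sizes = [1000, 500, 100, 10, 100, 500, 1000, 20]
dropout_rate = 0.5  # default rate, applied to every layer except the last

# Per-layer rates: dropout everywhere except the output layer.
rates = [dropout_rate] * (len(layer_sizes) - 1) + [0.0]

# Expected number of units that stay active in each layer during training.
expected_active = [size * (1.0 - p) for size, p in zip(layer_sizes, rates)]

for size, p, kept in zip(layer_sizes, rates, expected_active):
    print(f"size={size:5d}  dropout={p:.1f}  expected active units={kept:.0f}")
```

On average only 5 of the 10 bottleneck units are active in a training step, while every other hidden layer still keeps at least 50.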

Waseem AHmed · Accepted Answer · 2018-11-21T11:17:50

A dropout layer is added to prevent overfitting (regularization) in a neural network.

Dropout adds noise to the output values of a layer, which breaks up happenstance patterns that cause overfitting.

Here, a dropout rate of 0.5 means that 50% of the values are dropped, which is a high noise ratio and a definite no for a bottleneck layer.
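As a rough illustration of how noisy this is for a layer of only 10 units, the sketch below computes the probability that very few units survive a single training step. It assumes standard dropout, where each unit is dropped independently with the given rate; the function name and the threshold of 3 units are chosen only for illustration.

```python
from math import comb


def prob_at_most_active(n_units: int, drop_rate: float, k: int) -> float:
    """P(at most k of n_units stay active) when each unit is dropped independently."""
    keep = 1.0 - drop_rate
    return sum(
        comb(n_units, i) * keep**i * drop_rate ** (n_units - i)
        for i in range(k + 1)
    )


# Compare a few candidate rates for the 10-unit bottleneck.
for rate in (0.5, 0.2, 0.1):
    p = prob_at_most_active(10, rate, 3)
    print(f"drop rate {rate}: P(<= 3 of 10 units active) = {p:.4f}")
```

With a rate of 0.5, roughly one training step in six passes information through three or fewer bottleneck units; at lower rates this becomes rare.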

I would recommend that you first train your model without dropout on the bottleneck layer, and then compare the results against runs with increasing dropout rates.

Choose the model that performs best on your held-out validation data.
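The comparison could be organized along the following lines. This is only a sketch: `pick_bottleneck_dropout` and `train_and_validate` are hypothetical names, and `fake_train_and_validate` is a made-up stand-in that exists only so the example runs. In practice it would be replaced by a function that trains your network with the given bottleneck dropout rate and returns the validation loss.

```python
from typing import Callable, Dict, Iterable, Tuple


def pick_bottleneck_dropout(
    rates: Iterable[float],
    train_and_validate: Callable[[float], float],
) -> Tuple[float, Dict[float, float]]:
    """Train once per candidate rate; return the rate with the lowest validation loss."""
    losses = {rate: train_and_validate(rate) for rate in rates}
    best_rate = min(losses, key=losses.get)
    return best_rate, losses


# Placeholder only: a toy curve, NOT a real training result.
def fake_train_and_validate(rate: float) -> float:
    return (rate - 0.1) ** 2 + 0.3


# Start from no dropout on the bottleneck, then increase the rate.
candidates = [0.0, 0.1, 0.2, 0.3, 0.5]
best, results = pick_bottleneck_dropout(candidates, fake_train_and_validate)
print("best rate:", best)
```

Keeping the other layers' dropout rates fixed while varying only the bottleneck rate makes the runs directly comparable.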
