
I am currently building a CLDNN (Convolutional, LSTM, Deep Neural Network) model for raw signal classification.

Since the number of trainable parameters is easily in the millions, I thought dropout would help prevent overfitting.

My question also applies to other networks that stack multiple model types.

If I have the network structured as

input -> convolution -> LSTM -> DNN -> output

Do I have to put dropout after each layer, or only right before the output?

input -> convolution -> dropout -> LSTM -> dropout -> DNN -> dropout -> output

or

input -> convolution -> LSTM -> DNN -> dropout -> output

So far, I've only seen dropout applied to ConvNets, but I don't see why it should be restricted to them. Do other networks, such as LSTMs and DNNs, also use dropout to prevent overfitting?


1 Answer


Yes, you can use Dropout after each of your layers.

It does not make sense to apply dropout to your last layer (the layer that produces the probability distribution over the classes), though.

Do not forget that the LSTM is a recurrent model, so you have to use TensorFlow's DropoutWrapper class to apply dropout at each time step within the recurrent cell.
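
If it helps, here is a minimal sketch of that placement, assuming TensorFlow 1.x (where DropoutWrapper lives); the filter counts, kernel size, unit counts, and dropout rates below are placeholder values, not a recommendation:

```python
import tensorflow as tf  # assuming TensorFlow 1.x

def cldnn(inputs, num_classes, is_training):
    """Sketch of input -> conv -> dropout -> LSTM (DropoutWrapper) -> DNN -> dropout -> output."""
    # inputs: [batch, time, features] raw signal
    keep_prob = 0.8 if is_training else 1.0

    # Convolutional front end, followed by dropout
    net = tf.layers.conv1d(inputs, filters=64, kernel_size=8, activation=tf.nn.relu)
    net = tf.layers.dropout(net, rate=0.2, training=is_training)

    # Recurrent part: DropoutWrapper applies dropout at every time step
    cell = tf.nn.rnn_cell.LSTMCell(128)
    cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)
    net, _ = tf.nn.dynamic_rnn(cell, net, dtype=tf.float32)

    # Fully connected (DNN) part on the last time step, followed by dropout
    net = tf.layers.dense(net[:, -1, :], 256, activation=tf.nn.relu)
    net = tf.layers.dropout(net, rate=0.2, training=is_training)

    # Output layer: no dropout on the layer that produces the class scores
    logits = tf.layers.dense(net, num_classes)
    return logits
```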