I am working on a deep learning (CNN + AEs) approach on facial images.
I have:
- an input layer of 112*112*3 facial images
- 3 convolution + max pooling + ReLU layers
- 2 fully connected layers with 512 neurons and 50% dropout to avoid overfitting
- a final output layer with 10 neurons, since I have 10 classes
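
As a rough sketch of the sizes involved, the snippet below traces the feature-map shape through the three conv + pool blocks. It is plain Python rather than the actual TensorFlow code, and it assumes 'SAME'-padded stride-1 convolutions and 2x2 max pooling with stride 2; the filter counts in `HYPOTHETICAL_FILTERS` are placeholders, not values from the real model.

```python
# Sketch: spatial size of the feature maps through 3 conv + max-pool blocks.
# Assumptions (not stated above): 'SAME'-padded stride-1 convolutions,
# 2x2 max pooling with stride 2, and hypothetical filter counts.

def pooled_size(size, pool=2):
    """Output size of a 'SAME'-padded pooling layer whose stride equals pool."""
    return -(-size // pool)  # ceiling division


def feature_map_shapes(height, width, filters):
    """Return the (height, width, channels) shape after each conv + pool block."""
    shapes = []
    for channels in filters:
        # The 'SAME' convolution keeps height and width; only pooling shrinks them.
        height, width = pooled_size(height), pooled_size(width)
        shapes.append((height, width, channels))
    return shapes


HYPOTHETICAL_FILTERS = [32, 64, 128]  # placeholder values, not from the real model
shapes = feature_map_shapes(112, 112, HYPOTHETICAL_FILTERS)
h, w, c = shapes[-1]
flattened = h * w * c  # number of inputs to the first fully connected layer
print(shapes, flattened)
```

Under these assumptions the 112*112 input shrinks to 56, 28, and then 14 pixels per side, so the first fully connected layer sees 14*14 times the last filter count as its input size.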
I also used the mean (reduce_mean) of the softmax cross entropy as the loss, plus L2 regularization.
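
For reference, here is a minimal plain-Python sketch of that loss: the mean softmax cross entropy over a batch plus an L2 penalty. It is not the TensorFlow code itself; `l2_scale` is a hypothetical regularization strength, and the factor 0.5 is chosen to mirror the sum-of-squares-over-two convention of `tf.nn.l2_loss`.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross entropy of one example, from raw logits and an integer class label."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def total_loss(batch_logits, labels, weights, l2_scale):
    """Mean cross entropy over the batch plus an L2 penalty on the weights.

    l2_scale is a hypothetical regularization strength (an assumption).
    """
    per_example = [softmax_cross_entropy(z, y) for z, y in zip(batch_logits, labels)]
    data_loss = sum(per_example) / len(per_example)
    l2_penalty = l2_scale * 0.5 * sum(w * w for w in weights)
    return data_loss + l2_penalty


# Two examples with uniform logits over 10 classes, and two toy weights.
loss = total_loss([[0.0] * 10, [0.0] * 10], [0, 1], [1.0, 2.0], l2_scale=0.1)
print(loss)
```

With 10 classes, uniform predictions give a cross entropy of ln(10) per example, so that value is a handy baseline to compare the data term against.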
For training, I divided my dataset into 3 groups:
- 60% for training
- 20% for validation
- 20% for evaluation
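
The split above can be sketched roughly as follows. This is plain Python with a hypothetical `split_dataset` helper and a fixed seed, shown only to illustrate a shuffled 60/20/20 split; it is not the code from the project.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and split them into training, validation, and evaluation sets.

    Hypothetical helper: whatever is left after the training and validation
    fractions goes to the evaluation set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    evaluation = items[n_train + n_val:]
    return train, val, evaluation


train, val, evaluation = split_dataset(range(100))
print(len(train), len(val), len(evaluation))
```

Shuffling before slicing matters when the images are stored grouped by class, since otherwise the validation set could end up with a different class mix than the training set.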
The problem is that after a few epochs the validation error rate stays at a fixed value and never changes. I used TensorFlow to implement my project.
I haven't had such a problem with CNNs before, so this is the first time. I have checked the code, and it is based on the TensorFlow documentation, so I don't think the problem is with the code. Maybe I need to change some parameters, but I am not sure.
Any ideas about common solutions for such a problem?
Update: I changed the optimizer from momentum to Adam with the default learning rate. Now the validation error changes, but it is lower than the mini-batch error most of the time, even though both use the same batch size.
I have tested the model with and without biases, using 0.1 as the initial value, but there is no good fit yet.
Update: I fixed the issue. I will update with more details soon.