I am trying to build a multi-task CNN in TensorFlow that has two dense layers in parallel, one for age prediction and the other for gender prediction. How can I train each dense layer for a different number of epochs? One head may converge before the other, and training both for the same number of epochs would overfit one of them.
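
For concreteness, here is a minimal Keras sketch of the kind of setup I mean: a shared convolutional trunk feeding two parallel dense heads. The input size, layer widths, and the layer names `age` and `gender` are placeholders chosen for illustration, not my actual model.

```python
import tensorflow as tf


def build_model(input_shape=(64, 64, 3)):
    """Shared CNN trunk with two parallel dense heads (illustrative sizes)."""
    inputs = tf.keras.Input(shape=input_shape, name="image")

    # Shared convolutional trunk: both tasks read the same features.
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(inputs)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)

    # Two dense heads in parallel on top of the shared features.
    age = tf.keras.layers.Dense(1, name="age")(x)  # regression output
    gender = tf.keras.layers.Dense(1, activation="sigmoid", name="gender")(x)  # binary output

    return tf.keras.Model(inputs=inputs, outputs=[age, gender])


model = build_model()
```

Calling `model` on a batch of images returns two tensors, one per head, in the order `[age, gender]`.
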
Also, if I propagate the gradients of both the age and gender losses back to the CNN, would it overfit, since its weights are being updated at twice the rate of the dense layers?
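
To make the second concern concrete, here is a toy scalar sketch in plain Python (no TensorFlow). It assumes the two losses are simply added and both use squared error, which is a simplification for the gender head. The names `w`, `a`, and `g` are hypothetical stand-ins for the shared CNN weight and the two head weights. As far as I understand, under a summed loss the shared weight still gets one update per step, but its gradient is the sum of the two task gradients, while each head only sees its own task's gradient.

```python
def toy_gradients(w, a, g, x, y_age, y_gender):
    """Gradients of L = L_age + L_gender for a scalar toy model.

    Shared feature: h = w * x
    Age head:       age_pred = a * h,    L_age = (age_pred - y_age) ** 2
    Gender head:    gender_pred = g * h, L_gender = (gender_pred - y_gender) ** 2
    """
    h = w * x

    # Derivative of each loss with respect to its own prediction.
    d_age = 2.0 * (a * h - y_age)
    d_gender = 2.0 * (g * h - y_gender)

    # Each head weight only receives gradient from its own task.
    grad_a = d_age * h
    grad_g = d_gender * h

    # The shared weight receives one contribution per task (chain rule).
    grad_w_from_age = d_age * a * x
    grad_w_from_gender = d_gender * g * x

    return {
        "a": grad_a,
        "g": grad_g,
        "w_from_age": grad_w_from_age,
        "w_from_gender": grad_w_from_gender,
        "w": grad_w_from_age + grad_w_from_gender,
    }


grads = toy_gradients(w=1.0, a=2.0, g=3.0, x=1.0, y_age=1.0, y_gender=1.0)
```

So what I am really asking is whether this summed gradient on the shared layers is a problem in practice, or whether it is normally handled by weighting the two losses.
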