I learned from several articles that to compute the gradients for the filters, you simply convolve the input volume with the error matrix as the kernel. After that, you subtract the gradients (multiplied by the learning rate) from the filter weights. I implemented this process, but it isn't working.
I even tried doing the backpropagation by hand with pen and paper, but the gradients I calculated don't make the filters perform any better. So am I misunderstanding the whole process?
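Here is a minimal sketch of the update rule as I understood it, in plain NumPy (corr2d_valid and update_filter are just helper names I made up, not from any library):

```python
import numpy as np

def corr2d_valid(x, k):
    """'Valid' cross-correlation: slide k over x with no padding and no flipping."""
    h = x.shape[0] - k.shape[0] + 1
    w = x.shape[1] - k.shape[1] + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i+k.shape[0], j:j+k.shape[1]] * k)
    return out

def update_filter(weights, inp, error, lr):
    # gradient = input "convolved" with the error matrix,
    # then a plain gradient-descent step
    grad = corr2d_valid(inp, error)
    return weights - lr * grad
```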
Edit: here is an example of my understanding of backpropagation in CNNs and the problem with it.
Consider a randomised 3x3 input matrix for a convolutional layer:
1, 0, 1
0, 0, 1
1, 0, 0
And a randomised 2x2 weight matrix:
1, 0
0, 1
The output (after applying the ReLU activation) would be:
1, 1
0, 0
The target for this layer is a 2x2 matrix filled with zeros, so we know the weight matrix should end up filled with zeros as well.
The error (target minus output) is:
-1, -1
0, 0
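As a quick sanity check, here are the forward pass and the error in plain NumPy; it reproduces the numbers above:

```python
import numpy as np

x = np.array([[1., 0., 1.],
              [0., 0., 1.],
              [1., 0., 0.]])
w = np.array([[1., 0.],
              [0., 1.]])
target = np.zeros((2, 2))

out = np.zeros((2, 2))
for i in range(2):              # 'valid' cross-correlation, stride 1
    for j in range(2):
        out[i, j] = np.sum(x[i:i+2, j:j+2] * w)
out = np.maximum(out, 0.)       # ReLU
error = target - out            # the sign convention I'm using

print(out)    # [[1. 1.]
              #  [0. 0.]]
print(error)  # [[-1. -1.]
              #  [ 0.  0.]]
```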
By applying the process above (convolving the input with the error matrix), the gradients are:
-1, -1
0, -1
Subtracting them from the weights (with a learning rate of 1), the new weight matrix is:
2, 1
0, 2
This is not getting anywhere: if I repeat the process, the filter weights just grow to extremely large values instead of shrinking towards zero. I must have made a mistake somewhere, so what is it that I'm doing wrong?
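Here is a small script that repeats my procedure for a few steps (learning rate 1 for simplicity) and shows the weights growing at every step:

```python
import numpy as np

x = np.array([[1., 0., 1.],
              [0., 0., 1.],
              [1., 0., 0.]])
w = np.array([[1., 0.],
              [0., 1.]])
target = np.zeros((2, 2))
lr = 1.0

def corr2d_valid(a, k):
    """'Valid' cross-correlation: slide k over a with no padding and no flipping."""
    h, wd = a.shape[0] - k.shape[0] + 1, a.shape[1] - k.shape[1] + 1
    out = np.zeros((h, wd))
    for i in range(h):
        for j in range(wd):
            out[i, j] = np.sum(a[i:i+k.shape[0], j:j+k.shape[1]] * k)
    return out

for step in range(5):
    out = np.maximum(corr2d_valid(x, w), 0.)  # forward pass + ReLU
    # no ReLU derivative applied: in this example the error is zero
    # wherever the output is zero, so it would not change anything
    error = target - out                      # my sign convention from above
    grad = corr2d_valid(x, error)             # input "convolved" with the error
    w = w - lr * grad                         # subtract gradient * learning rate
    print(step, w)
# the printed weights grow every step instead of converging to zero
```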