Consider a Convolutional Neural Network with the following architecture:

Input → $C_1$ → $P_1$ → $C_2$ → $P_2$ → Softmax output layer

Here $C_i$ refers to the $i$-th convolutional layer and $P_i$ refers to the $i$-th mean-pooling layer. Corresponding to each layer will be an output. Let $\delta^{C_i}$ refer to the error in the output of layer $C_i$ (and the same for $\delta^{P_i}$).
$\delta^{P_2}$ can be calculated easily using the normal backpropagation equations, since the output of $P_2$ is fully connected to the softmax layer. $\delta^{C_2}$ can be calculated simply by upsampling $\delta^{P_2}$ appropriately (and multiplying it element-wise by the gradient of the output of $C_2$), since we are using mean pooling.
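For concreteness, this is how I do the upsampling step. It is a NumPy sketch rather than MATLAB, and the sizes (a 3×3 pooled map, a 2×2 pool window, and a linear activation) are made up for illustration:

```python
import numpy as np

# Assumed error at the output of the mean-pooling layer (3x3 for illustration).
delta_pool = np.arange(9.0).reshape(3, 3)

# Upsampling through 2x2 mean pooling: each pooled unit averaged a 2x2 block,
# so its error is replicated over the block and divided by the pool area (4).
delta_upsampled = np.kron(delta_pool, np.ones((2, 2))) / 4.0

# Multiply element-wise by f'(z) of the conv layer's output.
# A linear activation (f'(z) = 1 everywhere) is used here as a stand-in.
z_conv = np.random.randn(6, 6)      # pre-pooling conv output (assumed 6x6)
f_prime = np.ones_like(z_conv)      # f'(z) for a linear activation
delta_conv = delta_upsampled * f_prime
```

This gives a 6×6 error map from the 3×3 one, which is what I mean by "upsampling appropriately".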
How do we propagate the error from the output of $C_2$ to the output of $P_1$? In other words, how do we find $\delta^{P_1}$ from $\delta^{C_2}$?
Stanford's Deep Learning (UFLDL) tutorial uses the following equation to do this:

$$\delta_k^{(l)} = \text{upsample}\left( \left(W_k^{(l)}\right)^T \delta_k^{(l+1)} \right) \bullet f'\left(z_k^{(l)}\right)$$
However, I am facing the following problems when using this equation:
My $W_k^{(l)}$ has size $(2 \times 2)$ and my $\delta_k^{(l+1)}$ has size $(6 \times 6)$. (I am using valid convolution; the output of $C_2$ has size $(13 \times 13)$ and the output of $P_2$ has size $(6 \times 6)$.) The inner matrix multiplication $\left(W_k^{(l)}\right)^T \delta_k^{(l+1)}$ does not even make sense in my case.
The equation assumes that the number of channels in the two layers is the same. Again, this is not true for me: the output of $P_1$ has 64 channels, while the output of $C_2$ has 96 channels.
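To make the first problem concrete, here is a NumPy sketch (I work in MATLAB, but the shapes are the same) using the sizes from my network:

```python
import numpy as np

# Shapes from my network, for a single filter k and a single channel:
W = np.random.randn(2, 2)            # one 2x2 convolution filter
delta_next = np.random.randn(6, 6)   # error at the output of the pooling layer

# A literal reading of the tutorial's equation asks for the inner matrix
# product (W^T) @ delta_next. The inner dimensions (2 vs 6) do not match,
# so NumPy (like MATLAB) rejects it.
try:
    np.dot(W.T, delta_next)
    multiplication_defined = True
except ValueError:
    multiplication_defined = False   # this branch is taken: (2x2)(6x6) is undefined
# (The channel mismatch, 64 vs 96, is a separate problem on top of this.)
```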
What am I doing wrong here? Can anybody please explain how to propagate errors through a convolutional layer?
A simple MATLAB example would be highly appreciated.