I am currently studying this paper (page 53), in which they suggest that convolution be done in a special manner.
This is the formula:

$$q_{j,m} = \sigma\!\left(\sum_{i=1}^{I}\sum_{n=1}^{F} o_{i,\,n+m-1}\, w_{i,j,n}\right), \qquad j = 1,\dots,J$$
Here is their explanation:
As shown in Fig. 4.2, all input feature maps (assume I in total), O_i (i = 1, · · · , I) are mapped into a number of feature maps (assume J in total), Q_j (j = 1, · · · , J) in the convolution layers based on a number of local filters (I × J in total), w_{ij} (i = 1, · · · , I; j = 1, · · · , J). The mapping can be represented as the well-known convolution operation in signal processing. Assuming input feature maps are all one dimensional, each unit of one feature map in the convolution layer can be computed as in the equation above.
where o_{i,m} is the m-th unit of the i-th input feature map O_i, q_{j,m} is the m-th unit of the j-th feature map Q_j of the convolution layer, w_{i,j,n} is the nth element of the weight vector, w_{i,j}, connecting the ith feature map of the input to the jth feature map of the convolution layer, and F is called the filter size which is the number of input bands that each unit of the convolution layer receives.
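Just so the indexing is explicit, here is a literal NumPy sketch of how I read that formula (the array names and the choice of tanh for the activation σ are my own assumptions, not from the paper):

```python
import numpy as np

# My reading of the formula (my own names, not the paper's):
#   O : (I, T)     -> I input feature maps, each of length T
#   W : (I, J, F)  -> one length-F filter w_{ij} per (input map, output map) pair
#   Q : (J, T-F+1) -> J feature maps in the convolution layer

def conv_layer(O, W, sigma=np.tanh):
    I, T = O.shape
    _, J, F = W.shape
    Q = np.zeros((J, T - F + 1))
    for j in range(J):
        for m in range(T - F + 1):
            acc = 0.0
            for i in range(I):
                for n in range(F):
                    # o_{i, n+m-1} * w_{i,j,n}, shifted to 0-based indices
                    acc += O[i, m + n] * W[i, j, n]
            Q[j, m] = sigma(acc)
    return Q

# tiny check: I = 3 maps of length T = 10, J = 2 output maps, filter size F = 4
O = np.random.randn(3, 10)
W = np.random.randn(3, 2, 4)
print(conv_layer(O, W).shape)   # -> (2, 7)
```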
So far so good.

What I basically understood from this is what I've tried to illustrate in this image.

It seems to me that what they are doing is processing all data points up to F, across all feature maps; basically moving in both the x and y directions and computing one point from that.
Isn't that basically 2D convolution on a 2D image of size (I × F) with a filter equal to the image size? The weights don't seem to differ at all, or to have any importance here..?
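To spell out what I mean: for a single output unit, the double sum seems to collapse to an element-wise product of an (I × F) window with an (I × F) filter, summed up (toy shapes, my own notation):

```python
import numpy as np

I, J, F, T = 3, 2, 4, 10
O = np.random.randn(I, T)        # input feature maps (my own notation)
W = np.random.randn(I, J, F)     # filters w_{ij}

j, m = 0, 2                      # one output map, one output position
window = O[:, m:m + F]           # the (I x F) patch this unit "sees"
filt   = W[:, j, :]              # the matching (I x F) filter
q_jm   = np.sum(window * filt)   # the double sum is just one full-window dot product
```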
So why am I asking this here?

I am trying to implement this, but I am uncertain about what they are doing. Is it actually just basic convolution, in which a sliding window keeps feeding in new data, or is it not normal convolution, meaning that I need to design a special layer that does this operation?
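For reference, this is the equivalence I would like to confirm: under the "basic convolution" reading, the formula would just be ordinary 1-D sliding-window correlations, one per input map, summed over the input maps (scipy here is only my sanity check, not something the paper uses):

```python
import numpy as np
from scipy.signal import correlate

I, J, F, T = 3, 2, 4, 10
O = np.random.randn(I, T)
W = np.random.randn(I, J, F)

# "basic convolution" reading: one ordinary 1-D sliding-window correlation
# per (input map, output map) pair, summed over the input maps i
Q = np.zeros((J, T - F + 1))
for j in range(J):
    for i in range(I):
        Q[j] += correlate(O[i], W[i, j], mode='valid')

# literal double sum from the formula, for comparison
Q_ref = np.zeros_like(Q)
for j in range(J):
    for m in range(T - F + 1):
        Q_ref[j, m] = sum(O[i, m + n] * W[i, j, n]
                          for i in range(I) for n in range(F))

print(np.allclose(Q, Q_ref))   # True if the "basic convolution" reading is right
```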