Suppose we have an RGB image with 3 channels (red, green, blue) and we feed it to a convolutional neural network.

Does each convolutional filter always have a separate set of coefficients for each channel (R, G, B) of the image?

  1. I.e., does filter W1 have 3 different coefficient matrices, W1[::0], W1[::1] and W1[::2], as shown in the picture below?

  2. Or do modern neural networks often use the same coefficients for every channel within one filter (W1[::0] = W1[::1] = W1[::2])?

[Image: convolutional layer diagram from the CS231n notes]

Image source: http://cs231n.github.io/convolutional-networks/


The same notes also say the following (http://cs231n.github.io/convolutional-networks/#conv):

Convolutional Layer

...

The extent of the connectivity along the depth axis is always equal to the depth of the input volume. It is important to emphasize again this asymmetry in how we treat the spatial dimensions (width and height) and the depth dimension: The connections are local in space (along width and height), but always full along the entire depth of the input volume.

1 Answer

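The picture shows the first hidden layer, which here is a convolutional layer. Every filter has 3 channels because its input (for this layer, your image) has 3 channels (RGB). Each filter has its own separate coefficients for each channel, and the per-channel results are summed into a single feature map. The two filters therefore produce 2 feature maps, which are stacked along the depth axis (that explains the output volume of size 3x3x2).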
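To make the "separate coefficients per channel" point concrete, here is a minimal sketch in plain Python of how one filter produces one output value at one position. The helper name `filter_response`, the 2x2 patch and all the numbers are made up for illustration; they are not taken from the picture. It also shows what would happen if the coefficients were tied across channels (your option 2): the filter would behave like a single 2D kernel applied to the sum of the channels, so it could no longer tell the colours apart.

```python
# Hypothetical helper: response of ONE filter at ONE spatial position.
# patch and weights are both indexed [row][col][channel], i.e. N x N x C.
def filter_response(patch, weights, bias=0.0):
    total = bias
    for patch_row, weight_row in zip(patch, weights):
        for pixel, pixel_weights in zip(patch_row, weight_row):
            # One coefficient per channel; everything is summed into one number.
            for x_c, w_c in zip(pixel, pixel_weights):
                total += x_c * w_c
    return total

# A 2x2 patch of an RGB image (C = 3); values are invented for the example.
patch = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [1, 0, 1]]]

# Option 1: separate coefficients for R, G and B (the usual case).
w_separate = [[[1, 0, -1], [0, 1, 0]],
              [[-1, 0, 1], [2, 0, 0]]]

# Option 2: the same coefficient repeated for R, G and B.
w_tied = [[[1, 1, 1], [0, 0, 0]],
          [[-1, -1, -1], [2, 2, 2]]]

r_separate = filter_response(patch, w_separate)
r_tied = filter_response(patch, w_tied)

# With tied coefficients the result equals a 2D kernel applied to R + G + B.
channel_sum = [[sum(pixel) for pixel in row] for row in patch]
kernel_2d = [[1, 0], [-1, 2]]
r_channel_sum = sum(v * k
                    for v_row, k_row in zip(channel_sum, kernel_2d)
                    for v, k in zip(v_row, k_row))

print(r_separate, r_tied, r_channel_sum)
```

Nothing forces the learned coefficients to differ between channels, but in a standard convolutional layer they are independent parameters, so in general they do.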

More generally, for an input of size (1x)WxHxC (taking a batch size of 1 for simplicity), every filter has a size of NxNxC. With a stride of 1 and 'SAME' padding (assumed here for simplicity, even though your example uses 'VALID' padding), F filters give an output of size (1x)WxHxF.

I hope that is clear enough (in your example, W = H = 7, C = 3, N = 3 and F = 2).
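The sizes above can be checked with a short sketch. It uses the standard output-size formula floor((W - N + 2P) / S) + 1 for stride S and zero-padding P; the function names are hypothetical, chosen only for this example. Note that a stride of 2 is my inference from the 3x3 output mentioned above (with W = 7, N = 3 and no extra padding, a stride of 1 would give 5x5), not something stated in the text.

```python
def conv_output_size(w, n, stride=1, pad=0):
    """Spatial output size for input width w and filter width n."""
    return (w + 2 * pad - n) // stride + 1

def conv_output_shape(w, h, n, f, stride=1, pad=0):
    """Output (width, height, depth). The depth is the number of filters f;
    the input depth C does not appear because each filter spans all of it."""
    return (conv_output_size(w, n, stride, pad),
            conv_output_size(h, n, stride, pad),
            f)

def num_weights(n, c, f):
    """Number of coefficients in F filters of size N x N x C (biases excluded)."""
    return n * n * c * f

W = H = 7
C, N, F = 3, 3, 2

# 'SAME' padding with stride 1 and odd N: pad by (N - 1) // 2 on each side.
same_stride1 = conv_output_shape(W, H, N, F, stride=1, pad=(N - 1) // 2)

# 'VALID' padding (no extra padding) with stride 1 and with the assumed stride 2.
valid_stride1 = conv_output_shape(W, H, N, F, stride=1)
valid_stride2 = conv_output_shape(W, H, N, F, stride=2)

weights = num_weights(N, C, F)

print(same_stride1, valid_stride1, valid_stride2, weights)
```

The weight count shows the answer to the original question in numbers: each of the 2 filters holds 3 x 3 x 3 = 27 coefficients, one 3x3 matrix per input channel.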

Do not hesitate to comment if it is not clear enough :)