I am learning about convolutional neural networks with TensorFlow.

I have some doubts regarding tf.nn.conv2d. One of its parameters is filter:

a filter / kernel tensor of shape [filter_height, filter_width, in_channels, out_channels]

I do not understand what out_channels means.

Suppose the input image is [1, 3, 3, 1], so the size is 3x3 and there is 1 channel.
Then we have a filter [2, 2, 1, 5], which means that after filtering we will have an image of size 2x2 ("valid" padding) with 5 channels.

Where do the 5 channels come from? From my understanding, filtering can only generate 1 channel. Is TensorFlow using 5 different filter functions here?

Not a duplicate. I am asking a different question. It is a comment on that thread. Regarding this: "This still gives a 5x5 output image, but with 7 channels (size 1x5x5x7). Where each channel is produced by one of the filters in the set.", I still have difficulty understanding where the 7 channels come from. What do you mean by "filters in the set"? Thanks. - derek

1 Answer

The filter argument to the tf.nn.conv2d function, as you quoted, is a 4D tensor of dimensions [filter_height, filter_width, in_channels, out_channels]. This tensor represents a stack of out_channels filters of dimension filter_height x filter_width, to be applied over an image with in_channels channels.

The parameters filter_height, filter_width and out_channels are defined by you, whereas in_channels depends on the input you pass to tf.nn.conv2d.
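
The shape bookkeeping can be sketched in plain Python. The helper below, conv2d_output_shape, is a hypothetical name for illustration and not part of TensorFlow; it assumes stride 1 and "VALID" padding. It only shows that in_channels has to match the last dimension of the input, while out_channels is whatever you choose.

```python
def conv2d_output_shape(input_shape, filter_shape):
    """Output shape of a stride-1, VALID-padding 2D convolution.

    input_shape:  [batch, height, width, in_channels]
    filter_shape: [filter_height, filter_width, in_channels, out_channels]
    Hypothetical helper for illustration; not a TensorFlow function.
    """
    batch, height, width, channels = input_shape
    f_height, f_width, in_channels, out_channels = filter_shape
    # in_channels is dictated by the input, not chosen freely.
    if in_channels != channels:
        raise ValueError("filter in_channels must equal the input's channel count")
    if f_height > height or f_width > width:
        raise ValueError("filter is larger than the input")
    # out_channels is copied straight into the last output dimension.
    return [batch, height - f_height + 1, width - f_width + 1, out_channels]


print(conv2d_output_shape([1, 3, 3, 1], [2, 2, 1, 5]))  # [1, 2, 2, 5]
print(conv2d_output_shape([1, 3, 3, 1], [2, 2, 1, 7]))  # [1, 2, 2, 7]
```

Changing only out_channels changes only the last dimension of the result; the spatial size stays the same.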

In other words, a filter tensor with dimensions [2, 2, 1, 5] represents 5 different 2 x 2 filters to be applied over a 1-channel input, but you could just as well change it to [2, 2, 1, 7], or whatever else gives you better results.
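
To make "5 different filters" concrete, here is a minimal pure-Python sketch of the computation for a single image, assuming stride 1 and "VALID" padding. The function name conv2d_valid and the sample values are made up for illustration; this is not how TensorFlow implements it. Like tf.nn.conv2d, it slides each filter over the image without flipping it (cross-correlation), and each filter writes its own output channel.

```python
def conv2d_valid(image, filters):
    """Naive stride-1, VALID-padding convolution on nested lists.

    image:   [height][width][in_channels]   (a single image, no batch axis)
    filters: [f_height][f_width][in_channels][out_channels]
    Returns: [out_height][out_width][out_channels]
    """
    height, width = len(image), len(image[0])
    f_height, f_width = len(filters), len(filters[0])
    in_channels, out_channels = len(filters[0][0]), len(filters[0][0][0])
    output = []
    for y in range(height - f_height + 1):
        row = []
        for x in range(width - f_width + 1):
            pixel = []
            for k in range(out_channels):  # one output channel per filter
                total = 0.0
                for dy in range(f_height):
                    for dx in range(f_width):
                        for c in range(in_channels):
                            total += image[y + dy][x + dx][c] * filters[dy][dx][c][k]
                pixel.append(total)
            row.append(pixel)
        output.append(row)
    return output


# A 3x3 image with 1 channel, holding the values 1..9.
image = [[[float(3 * r + c + 1)] for c in range(3)] for r in range(3)]

# Five 2x2 filters over 1 input channel; filter k has every weight equal to k + 1.
filters = [[[[float(k + 1) for k in range(5)]] for _ in range(2)] for _ in range(2)]

result = conv2d_valid(image, filters)
print(len(result), len(result[0]), len(result[0][0]))  # 2 2 5
print(result[0][0])  # [12.0, 24.0, 36.0, 48.0, 60.0]
```

The top-left output position holds 5 numbers because the same 2x2 patch (1, 2, 4, 5) was multiplied by 5 different sets of weights. In a real network those weights are learned rather than fixed by hand.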

To further illustrate, in the following gif you have a [3, 3, 1, 1] tensor filter convolving over a [1, 5, 5, 1] image. This means you have only 1 filter being convolved over the image.

Convolution GIF

GIF source