
I am new to CNNs, and I have a basic question about the mapping between the input image and the neurons in the first convolutional layer.

My question is: should the input image go to all neurons in the first convolutional layer (I mean the first hidden layer), or not?

For example: if the first hidden layer of my CNN has 8 neurons, is the complete input image passed to all 8 neurons, or is only a subset of the input pixels passed to each neuron?

1 Answer


I am not sure the question is quite the right one to ask: the number of neurons in a convolutional layer is not something you normally concern yourself with, unless you are building your own low-level implementation of a CNN.

To answer your question: no. Each neuron in the first convolutional layer is connected only to the pixels in its own receptive field (determined by the kernel size). The same logic applies to the following convolutional layers as well, except that their neurons are connected to the lower-layer neurons in their receptive field.
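To make the receptive-field idea concrete, here is a minimal NumPy sketch (the image and kernel values are made up purely for illustration): a single output "neuron" only ever touches a kernel-sized patch of the input, never the whole image.

```python
import numpy as np

# Hypothetical 5x5 grayscale image and a single 3x3 kernel (illustrative values).
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))

# The "neuron" at output position (0, 0) sees only the top-left 3x3 patch:
receptive_field = image[0:3, 0:3]
activation = np.sum(receptive_field * kernel)

# Every other pixel of the image is irrelevant to this particular neuron.
print(activation)
```

Sliding the same kernel one step to the right gives the neuron at output position (0, 1), which sees `image[0:3, 1:4]` instead, and so on across the whole image.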

For example: if my first hidden layer in CNN as 8 neurons

How do you know that it has 8 neurons? Unless you are doing some low-level CNN programming, you do not specify the number of neurons directly. The number of neurons you need is determined by a combination of the kernel size, the stride, the type of padding, and the number of filters you have chosen to use. These four things (together with the input image size) tell you exactly how many neurons you need.
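As a sketch, the standard output-size formula ties those four things together (assuming a square input and kernel, with "valid" padding meaning pad = 0):

```python
# Output side length of a conv layer: (input - kernel + 2*pad) / stride + 1.
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    return (input_size - kernel_size + 2 * padding) // stride + 1

# 100x100 input, 3x3 kernel, stride 1, no padding -> 98x98 per filter.
side = conv_output_size(100, kernel_size=3, stride=1, padding=0)
filters = 128

# One "neuron" per output position per filter.
num_neurons = side * side * filters
print(side, num_neurons)  # 98 1229312
```

So the layer's "neuron count" falls out of the hyperparameters; you never set it by hand.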

For example, in Keras (since you have tagged this question with tensorflow) you might see a convolutional layer like this one:

keras.layers.Conv2D(filters=128, kernel_size=3, strides=1, activation="relu",
                    input_shape=(100, 100, 1))

As you can see, you don't specify anything like the number of neurons here (at least not directly). With these settings, the default padding is "valid" (i.e. no padding), so the output's width and height shrink by kernel_size - 1 = 2, giving this layer an output shape of (98, 98, 128) (technically (None, 98, 98, 128), where None is the batch size). If you flattened this output and fed it into a single neuron (say, in a dense layer), you would end up with 128 * 98 * 98 = 1,229,312 weights (plus one bias) just between these two layers.

So, for the analogy between dense and convolutional layers: the convolutional layer above with 128 filters, connected to one output neuron, is similar to a dense layer with 1,229,312 neurons connected to one output neuron.
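You can check this analogy with plain arithmetic, no framework required (the numbers assume the 100x100 input and the Conv2D settings above):

```python
# Flattened conv output: 128 filters, each producing a 98x98 feature map.
conv_output = 128 * 98 * 98

# A single dense output neuron needs one weight per flattened input value.
dense_weights = conv_output * 1

print(conv_output, dense_weights + 1)  # 1229312 values, 1229313 params with bias
```

This is exactly why flattening a conv output into a dense layer tends to dominate a model's parameter count.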