0
votes

I have been working on creating a convolutional neural network from scratch, and am a little confused on how to treat kernel size for hidden convolutional layers. For example, say I have an MNIST image as input (28 x 28) and put it through the following layers.

Convolutional layer with kernel_size = (5,5) with 32 output channels

  • new dimension of throughput = (32, 28, 28)

Max Pooling layer with pool_size (2,2) and step (2,2)

  • new dimension of throughput = (32, 14, 14)

If I now want to create a second convolutional layer with kernel size = (5x5) and 64 output channels, how do I proceed? Does this mean that I only need two new filters (2 x 32 existing channels) or does the kernel size change to be (32 x 5 x 5) since there are already 32 input channels?

Since the initial input was a 2D image, I do not know how to conduct convolution for the hidden layer since the input is now 3 dimensional (32 x 14 x 14).

1

1 Answers

0
votes

you need 64 kernel, each with the size of (32,5,5) .

depth(#channels) of kernels, 32 in this case, or 3 for a RGB image, 1 for gray scale etc, should always match the input depth, but values are all the same. e.g. if you have a 3x3 kernel like this : [-1 0 1; -2 0 2; -1 0 1] and now you want to convolve it with an input with N as depth or say channel, you just copy this 3x3 kernel N times in 3rd dimension, the following math is just like the 1 channel case, you sum all values in all N channels which your kernel window is currently on them after multiplying the kernel values with them and get the value of just 1 entry or pixel. so what you get as output in the end is a matrix with 1 channel:) how much depth you want your matrix for next layer to have? that's the number of kernels you should apply. hence in your case it would be a kernel with this size (64 x 32 x 5 x 5) which is actually 64 kernels with 32 channels for each and same 5x5 values in all cahnnels.

("I am not a very confident english speaker hope you get what I said, it would be nice if someone edit this :)")