Output of convolution

Question

Suppose we have a 5x5 image and a 3x3 kernel, with stride 2 and padding on. What is the size of the output image after it passes through a convolutional layer of a neural network?

brbr brbr · Accepted Answer · 2018-10-22T20:16:20

The other answer is correct, but here is a drawing which visualizes why this formula holds:

I: Image size, K: Kernel size, P: Padding, S: Stride

I will explain the formula for a single direction only (shifting the filter to the right), since it's the same principle for the other direction.

Imagine you place the kernel (the filter) in the upper-left corner of the padded image.

Then there are I-K+2*P pixels left over on the right-hand side. If your stride is S, you can shift the kernel into this remaining part floor( (I-K+2*P)/S ) times. You can verify that you need the "floor" by trying an image of 4x4 pixels, where the division does not come out even. You then have to add one for the initial position of the kernel to get the total number of kernel positions.
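To make the counting argument concrete, here is a minimal sketch that simply lists every offset at which the kernel fits in one padded row. The function name `kernel_positions` and the example values (a 4-pixel row, 3-wide kernel, padding 1, stride 2) are my own assumptions for illustration, not taken from the question.

```python
def kernel_positions(image_size, kernel_size, padding, stride):
    """List the left-edge offsets at which the kernel fits in the padded row."""
    padded_size = image_size + 2 * padding
    # Largest offset that still keeps the kernel inside the padded row.
    # This equals I - K + 2*P, the "left over" pixels from the explanation.
    last_start = padded_size - kernel_size
    # Start at offset 0 (the initial position) and step by the stride.
    return list(range(0, last_start + 1, stride))


# Illustrative values (assumed): I=4, K=3, P=1, S=2
positions = kernel_positions(4, 3, 1, 2)
print(positions)       # [0, 2]
print(len(positions))  # 2
```

Here 3 pixels are left over, and 3/2 = 1.5, but only one full shift fits. That is why the division has to be rounded down before adding the initial position.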

Thus there are floor( (I-K+2*P)/S ) + 1 positions in total, which is the formula for your output size. Hope that helps.
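If you want to check the formula quickly, a small sketch in Python could look like this. The function name `conv_output_size` is hypothetical, and I am assuming that "padding on" in the question means P = 1, which the question does not state explicitly.

```python
def conv_output_size(image_size, kernel_size, padding, stride):
    """Output size along one dimension: floor((I - K + 2*P) / S) + 1."""
    leftover = image_size - kernel_size + 2 * padding
    if leftover < 0:
        raise ValueError("kernel is larger than the padded image")
    # Integer division floors the result for non-negative operands.
    return leftover // stride + 1


# Values from the question, assuming "padding on" means P = 1
print(conv_output_size(5, 3, 1, 2))  # 3

# The same image without padding (P = 0), for comparison
print(conv_output_size(5, 3, 0, 2))  # 2
```

Under that assumption the 5x5 image gives a 3x3 output, since the same calculation applies in both directions.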
