I am using Python 3.8 and PyTorch 1.7.1. I saw a code which defines a Conv2d layer as follows:
Conv2d(3, 6, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
The input 'X' being passed to it is a 4D tensor-
X.shape
# torch.Size([4, 3, 6, 6])
The output volume for this conv layer is:
c1(X).shape
# torch.Size([4, 6, 3, 3])
I am trying to use the formula to compute output spatial dimensions for any conv layer: O = ((W - K + 2P)/S) + 1, where W = spatial dimension of image, K = filter/kernel size, P = zero padding & S = stride.
For 'c1' conv layer, we get, W = 6, K = 3, S = 2 & P = 1. Using the formula, you get O = ((6 - 3 + (2 x 1)) / 2) + 1 = 5/2 + 1 = 3.5.
The output volume: (4, 6, 3, 3) since number of filters used = 6. How is the spatial output from 'c1' then (3, 3)? What am I not getting?
Thanks!