
I have a question about tf.layers.conv3d. If I have understood correctly, it takes an input of shape

(Batch x depth x height x width x channels)

where channels should be only one; and, given a filter shape (depth x height x width), it creates num_filters different filters of that shape, convolves each of them with the input, and stacks the results into num_filters output channels, yielding an output of shape

(Batch x out_depth x out_height x out_width x num_filters)

First of all, am I right so far? The question is: it seems to me that this layer does not obey the rule relating the input, output, filter and stride shapes of convolutional layers, which should be:

(W-F+2P)/S + 1

As described here. Instead, the output depth, height and width are always the same as the input's. What is happening? Thanks for the help!
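For concreteness, here is a minimal sketch of what I am observing; the input shape and kernel size are just example values (this assumes TensorFlow 1.x, where tf.layers.conv3d is available):

import tensorflow as tf

# example input: batch of 2 volumes of 16x16x16 voxels with 1 channel
x = tf.placeholder(tf.float32, shape=[2, 16, 16, 16, 1])

# 8 filters of size 3x3x3, stride 1
y = tf.layers.conv3d(x, filters=8, kernel_size=3, strides=1, padding='same')

print(y.shape)  # (2, 16, 16, 16, 8): depth/height/width unchanged,
                # even though (16 - 3)/1 + 1 = 14 according to the formula above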

This happens if you use padding='same'. – BlackBear
Please illustrate the concrete input, kernel and output sizes, as well as the padding, stride and dilation options passed to conv3d. The formula you posted is correct, but it may not be apparent what the precise value of each of those parameters is (also, it does not include dilation, if you are using that). Btw, you can have multiple channels in the input; it does not need to be fixed to one. – jdehesa

1 Answer

Kinda true, but with padding='same' the filter size no longer affects the spatial output size. If the input shape, filter shape and strides are:
[Batch, depth, height, width, channels]
[filter_depth, filter_height, filter_width, in_channels, out_channels]
[1, s1, s2, s3, 1]

then the output shape is
[Batch, ceil(depth/s1), ceil(height/s2), ceil(width/s3), out_channels]
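For example, a quick check of that rule; the 17x17x17 input and 3x3x3 kernel are just illustrative values (assuming TensorFlow 1.x):

import tensorflow as tf

# 17 is not divisible by the stride of 2, so the rounding is visible
x = tf.placeholder(tf.float32, shape=[1, 17, 17, 17, 1])
y = tf.layers.conv3d(x, filters=4, kernel_size=3, strides=2, padding='same')

print(y.shape)  # (1, 9, 9, 9, 4), since ceil(17 / 2) = 9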

tf.layers.conv3d is a special case of the more general tf.nn.convolution operation.

For understanding the padding algorithm: https://www.tensorflow.org/api_guides/python/nn#Convolution

For understanding the convolution operation: https://www.tensorflow.org/api_docs/python/tf/nn/convolution
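As a rough summary of those pages, the output sizes for the two padding modes can be sketched like this (a restatement of the linked docs in plain Python, not library code):

import math

def out_size_valid(in_size, filter_size, stride, dilation=1):
    # 'VALID': no padding, the window must fit entirely inside the input
    effective_filter = (filter_size - 1) * dilation + 1
    return math.ceil((in_size - effective_filter + 1) / stride)

def out_size_same(in_size, stride):
    # 'SAME': enough padding so the output depends only on the input size and stride
    return math.ceil(in_size / stride)

print(out_size_valid(16, 3, 1))  # 14 -> matches (W - F)/S + 1 with P = 0
print(out_size_same(16, 1))      # 16 -> spatial size preserved
print(out_size_same(17, 2))      # 9  -> ceil(17 / 2)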