3
votes

TensorFlow implements a basic convolution operation with tf.nn.conv2d.

I am specifically interested in the "strides" parameter, which lets you set the stride of the convolution filter -- how far across the image you shift the filter each time.

The example given in one of the early tutorials, with an image stride of 1 in each direction, is

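import tensorflow as tf
def conv2d(x, W):
  # Stride of 1 along every dimension of the NHWC input; 'SAME' zero-pads the borders.
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')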

The strides array is explained more in the linked docs:

In detail, with the default NHWC format...

Must have strides[0] = strides[3] = 1. For the most common case of the same horizontal and vertical strides, strides = [1, stride, stride, 1].

Note the order of "strides" matches the order of inputs: [batch, height, width, channels] in the NHWC format.
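To make the role of each entry concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of how the height and width strides affect the output shape. The helper name conv2d_output_shape is hypothetical, not part of any library; it assumes NHWC input, a filter laid out as [filter_height, filter_width, in_channels, out_channels], and the usual 'SAME' and 'VALID' shape rules.

```python
import math

def conv2d_output_shape(input_shape, filter_shape, strides, padding="SAME"):
    """Hypothetical helper: output shape of a 2-D convolution on NHWC input.

    input_shape:  (batch, height, width, in_channels)
    filter_shape: (filter_height, filter_width, in_channels, out_channels)
    strides:      [1, stride_h, stride_w, 1], same order as the input dims
    """
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    if f_in_c != in_c:
        raise ValueError("filter in_channels must match input channels")
    # Mirrors the documented constraint quoted above.
    if strides[0] != 1 or strides[3] != 1:
        raise ValueError("strides[0] and strides[3] must be 1")
    _, s_h, s_w, _ = strides

    if padding == "SAME":
        # Zero-padding makes the output size depend only on the stride.
        out_h = math.ceil(in_h / s_h)
        out_w = math.ceil(in_w / s_w)
    elif padding == "VALID":
        # No padding: the filter must fit entirely inside the image.
        out_h = math.ceil((in_h - f_h + 1) / s_h)
        out_w = math.ceil((in_w - f_w + 1) / s_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    # Batch size is untouched; channels are set by the filter, not by a stride.
    return (batch, out_h, out_w, out_c)

print(conv2d_output_shape((8, 28, 28, 3), (5, 5, 3, 32), [1, 1, 1, 1]))
print(conv2d_output_shape((8, 28, 28, 3), (5, 5, 3, 32), [1, 2, 2, 1]))
```

Only the middle two stride entries change the spatial size; the batch size passes through and the output channel count comes from the filter.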

Obviously, a stride other than 1 for the batch and channels dimensions wouldn't make sense, right? (Your filter should always be applied to every image in the batch and every channel.)

But then why is it even possible to put something other than 1 in strides[0] and strides[3]? (By "possible" I mean that the Python list you pass in can contain any values there, regardless of the documentation quoted above.)

Is there a situation where I would have a non-one stride for the batch or channels dimension, e.g.

tf.nn.conv2d(x, W, strides=[2, 1, 1, 2], padding='SAME')

If so, what would that example even mean in terms of the convolution operation?

1 Answer

-1
votes

There might be a situation where you send a video in chunks, so that each batch is a sequence of consecutive frames. Assuming that neighboring frames are quite similar, you could skip some of them by increasing the batch stride. That is as far as I understand it; I don't know what a channel stride would be used for, though.
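As a rough illustration of the frame-skipping idea, the same effect can be had by slicing the batch before the convolution, leaving strides[0] at 1 as the documentation requires. This is a sketch under that assumption, not a description of what tf.nn.conv2d does internally; subsample_batch is a hypothetical helper, and a plain list stands in for the batch of frames.

```python
def subsample_batch(frames, batch_stride):
    """Hypothetical helper: keep every batch_stride-th frame of a batch."""
    if batch_stride < 1:
        raise ValueError("batch_stride must be a positive integer")
    # Step slicing along the first (batch) axis; the same [::k] syntax
    # also works on the leading axis of array-like batches.
    return frames[::batch_stride]

# Ten frame indices standing in for ten video frames.
frames = list(range(10))
print(subsample_batch(frames, 2))  # prints [0, 2, 4, 6, 8]
```

The subsampled batch can then be passed to the convolution with strides=[1, stride, stride, 1].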