I'm looking through the convolutional net layers tutorial for Tensorflow: https://www.tensorflow.org/tutorials/layers#dense_layer
In the tutorial, the first two layers are like this:
conv1 = tf.layers.conv2d(inputs=input_layer, filters=32, kernel_size=[5, 5], padding="same", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
After the first convolutional layer, the tutorial says: "Our output tensor produced by conv2d() has a shape of [batch_size, 28, 28, 32]: the same width and height dimensions as the input, but now with 32 channels holding the output from each of the filters."
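Just to spell out the shape arithmetic the tutorial is using here, a quick sketch (the numbers are the tutorial's; the helper below is mine): with `padding="same"`, the spatial output size is `ceil(input_size / stride)`, and the number of filters becomes the channel dimension.

```python
import math

def same_padding_output_shape(height, width, stride, filters):
    """Spatial output shape of a conv layer with padding="same":
    ceil(input / stride) per spatial axis; filters -> channels."""
    return (math.ceil(height / stride), math.ceil(width / stride), filters)

# conv1: 28x28 input, stride 1, 32 filters
print(same_padding_output_shape(28, 28, 1, 32))  # (28, 28, 32)
```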
But after the first pooling layer, the tutorial says: "Our output tensor produced by max_pooling2d() (pool1) has a shape of [batch_size, 14, 14, 1]: the 2x2 filter reduces width and height by 50%."
Shouldn't the pooling layer actually produce a tensor of shape [batch_size, 14, 14, 32], since max_pooling2d pools over the height and width axes but not the channel axis? That would also be consistent with what the tutorial says about layer 2 further on:
"conv2 has a shape of [batch_size, 14, 14, 64], the same width and height as pool1 (due to padding="same"), and 64 channels for the 64 filters applied.
Pooling layer #2 takes conv2 as input, producing pool2 as output. pool2 has shape [batch_size, 7, 7, 64] (50% reduction of width and height from conv2)."
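To sanity-check my reading, here is a minimal NumPy sketch of 2x2/stride-2 max pooling (my own helper, not the tutorial's code) showing that only height and width shrink while the channel axis is untouched:

```python
import numpy as np

def max_pool_2x2(x):
    """Max-pool a [batch, height, width, channels] tensor with a 2x2
    window and stride 2. Height and width are halved; the batch and
    channel dimensions are left alone."""
    b, h, w, c = x.shape
    # Split height and width into 2x2 blocks, then take the max of each block.
    blocks = x.reshape(b, h // 2, 2, w // 2, 2, c)
    return blocks.max(axis=(2, 4))

x = np.random.rand(1, 28, 28, 32)  # shape of conv1's output
print(max_pool_2x2(x).shape)       # (1, 14, 14, 32) -- 32 channels preserved
```

So pool1 should come out as [batch_size, 14, 14, 32], matching conv2's stated input shape.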
Thanks for looking.