I'm following the "Deep MNIST for Experts" tutorial for TensorFlow: https://www.tensorflow.org/tutorials/mnist/pros/
The weights of the second convolutional layer have the shape [5, 5, 32, 64]; that is, the layer has 32 input channels, whereas the first convolutional layer had 1 input channel (which, as I understand it, holds the grayscale values of the original image).
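For reference, this is how I read the two weight shapes, assuming the layout is [filter_height, filter_width, in_channels, out_channels]. The names `w_conv1` and `w_conv2` are just placeholders I chose, and the arrays are zero-filled because only the shapes matter here:

```python
import numpy as np

# Assumed layout: [filter_height, filter_width, in_channels, out_channels]
w_conv1 = np.zeros((5, 5, 1, 32))   # first layer: 1 input channel, 32 filters
w_conv2 = np.zeros((5, 5, 32, 64))  # second layer: 32 input channels, 64 filters

for name, w in (("conv1", w_conv1), ("conv2", w_conv2)):
    kh, kw, c_in, c_out = w.shape
    # Under this reading, each filter spans all input channels at once.
    print(f"{name}: {c_out} filters, each of shape {kh}x{kw}x{c_in}, "
          f"{w.size} weights in total")
```

If this reading is right, each second-layer filter is not really a 2D 5x5 array but a 5x5x32 block.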
What does it mean for the second convolutional layer to have 32 input channels? Does it mean that the 64 filters learned in the second layer are each applied to (slid across) a "virtual" image with 32 values per pixel, where this "virtual" image consists of the 32 outputs obtained by applying each first-layer filter to the original image? And if that is correct, how do you apply a 2D 5x5 filter to an image that has 32 values per pixel?
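To make the question concrete, here is a minimal NumPy sketch of what I suspect happens; it is my guess, not code from the tutorial. It assumes each filter covers all 32 channels and that the products are summed over height, width, and channels. The function name `conv2d_valid`, the 14x14 input size, and the random data are made up for illustration, and I use no padding and a stride of 1 to keep the loop simple:

```python
import numpy as np

def conv2d_valid(image, filters):
    """Naive multi-channel convolution (no padding, stride 1).

    image:   (H, W, C_in)
    filters: (kH, kW, C_in, C_out)
    returns: (H - kH + 1, W - kW + 1, C_out)
    """
    h, w, c_in = image.shape
    kh, kw, f_in, c_out = filters.shape
    assert c_in == f_in, "filter depth must match the number of input channels"

    out = np.zeros((h - kh + 1, w - kw + 1, c_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw, :]  # shape (kH, kW, C_in)
            # Multiply the patch with every filter and sum over height,
            # width, AND input channels: one number per output filter.
            out[i, j, :] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
feature_maps = rng.standard_normal((14, 14, 32))  # hypothetical "virtual" image
w_conv2 = rng.standard_normal((5, 5, 32, 64))     # 64 filters, each 5x5x32

out = conv2d_valid(feature_maps, w_conv2)
print(out.shape)  # (10, 10, 64)
```

Under this assumption, each of the 64 filters produces a single 2D output map, so the result again has one value per filter at every position.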