When creating a convolutional neural network (CNN) (e.g. as described in https://cs231n.github.io/convolutional-networks/), the input layer is connected to one or more filters, each producing a feature map. Each neuron in a filter layer is connected to only a few neurons of the input layer (a local patch). In the simplest case, each of my n filters has the same dimensions and uses the same stride.
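To make the setup concrete, here is a minimal NumPy sketch of what I have in mind. The image size, the number of filters, and the helper `extract_patches` are hypothetical choices for illustration only, not taken from the linked notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(image, k, stride):
    """Return every k x k patch of a 2-D image, each flattened to one row."""
    h, w = image.shape
    rows = range(0, h - k + 1, stride)
    cols = range(0, w - k + 1, stride)
    return np.array([image[r:r + k, c:c + k].ravel() for r in rows for c in cols])

image = rng.standard_normal((6, 6))  # toy single-channel input (assumed size)
n, k, stride = 3, 3, 1               # n filters, all k x k, all with the same stride

# Each filter gets its own randomly initialized weights and bias.
weights = rng.standard_normal((n, k * k)) * 0.1
biases = rng.standard_normal(n) * 0.1

# All n filters are applied to exactly the same patches.
patches = extract_patches(image, k, stride)   # shape: (num_patches, k * k)
feature_maps = patches @ weights.T + biases   # shape: (num_patches, n)
```

So every filter sees the same rows of `patches`; before training, the only thing that distinguishes one column of `feature_maps` from another is the random initialization.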
My (closely related) questions are:
- How is it ensured that the filters learn different features, even though they are trained on the same patches?
- Does the feature a filter learns depend on the random values (for weights and biases) assigned when the network is initialized?