Batch Normalization: Axis on which to take mean and variance

Question

I am trying to implement Batch Normalization (http://arxiv.org/pdf/1502.03167.pdf) in my convolutional neural network, but I am confused about which axis I should compute the mean and variance over.

If the input to the conv layer has shape 3 * 224 * 224 * 32,
where:
3 - input channels
224 * 224 - shape of a single channel
32 - minibatch size

what should the axis be in the following formula?
Mean = numpy.mean(input_layer, axis= ? )

And if the input to the fully connected layer has shape 100 * 32,
where:
100 - number of inputs
32 - minibatch size

what should the axis be in the following formula?
Mean = numpy.mean(input_layer, axis= ? )

dontloo · Accepted Answer · 2016-03-04T02:17:54

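# 1. Conv layer, input shape (3, 224, 224, 32): axis = (1, 2, 3), one mean per channel
Mean = numpy.mean(input_layer, axis=(1, 2, 3))  # result has shape (3,)
# 2. Fully connected layer, input shape (100, 32): axis = 1, one mean per input
Mean = numpy.mean(input_layer, axis=1)  # result has shape (100,)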

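In both cases the statistics are computed per feature over the minibatch. For convolutional layers with shared weights, each feature map (channel) is normalized as a whole, so the mean and variance are taken over the spatial locations as well as the minibatch. For fully connected layers, each of the 100 inputs is normalized separately over the 32 samples in the minibatch.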
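
To make the axis choice concrete, here is a minimal NumPy sketch of the normalization step for both layouts from the question. It is an illustration only, not the Keras implementation: the helper names `batch_norm_stats` and `normalize`, the `eps` value, and the reduced 32 * 32 spatial size (used to keep the example small) are all assumptions, and the learned scale and shift parameters from the paper are left out.

```python
import numpy as np

def batch_norm_stats(x, reduce_axes):
    """Mean and variance over reduce_axes (hypothetical helper).

    keepdims=True keeps the reduced axes with size 1 so the results
    broadcast against x in the normalization step.
    """
    mean = np.mean(x, axis=reduce_axes, keepdims=True)
    var = np.var(x, axis=reduce_axes, keepdims=True)
    return mean, var

def normalize(x, reduce_axes, eps=1e-5):
    """Subtract the mean and divide by the standard deviation.

    eps is an assumed small constant for numerical stability; the learned
    scale and shift are omitted in this sketch.
    """
    mean, var = batch_norm_stats(x, reduce_axes)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)

# Conv layer input in the question's layout: (channels, height, width, batch).
conv_input = rng.normal(loc=2.0, scale=3.0, size=(3, 32, 32, 8))
# Fully connected layer input in the question's layout: (inputs, batch).
fc_input = rng.normal(loc=-1.0, scale=0.5, size=(100, 32))

# One mean/variance per channel: reduce over height, width and batch.
conv_mean, conv_var = batch_norm_stats(conv_input, (1, 2, 3))
# One mean/variance per input: reduce over the batch axis only.
fc_mean, fc_var = batch_norm_stats(fc_input, 1)

conv_out = normalize(conv_input, (1, 2, 3))
fc_out = normalize(fc_input, 1)

print(conv_mean.shape, fc_mean.shape)  # (3, 1, 1, 1) (100, 1)
```

If the data were stored batch-first instead, for example (batch, channels, height, width), the same idea applies with different axes: reduce over every axis except the channel (or feature) axis.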

Code of the BN layer of the Keras library for reference: https://github.com/fchollet/keras/blob/0daec53acbf4c3df6c054b36ece5c1ae2db55d86/keras/layers/normalization.py
