I am using a fully connected neural network for MNIST image recognition.
My network has 784 input neurons, one hidden layer with 1569 neurons, and an output layer with 10 neurons.
I have two questions:
I use the sigmoid activation and the error formula error = output * (1 - output) * (target - output). The problem is that if the output neuron produces 1 while the target value is 0, then error = 0, but that can't be right, can it?
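To illustrate what I mean numerically, here is a minimal sketch of that formula (the values are made up):

```python
# delta = output * (1 - output) * (target - output)
def delta(output, target):
    return output * (1 - output) * (target - output)

print(delta(0.5, 0.0))    # -0.125   -> a reasonable error signal
print(delta(0.99, 0.0))   # ~-0.0098 -> the signal is already tiny
print(delta(1.0, 0.0))    # 0.0      -> no signal at all, even though the answer is maximally wrong
```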
Is it right to use the sigmoid if the weighted sum in the hidden layer becomes so large that the output saturates at 1? What values should I initialize the weights with?
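For reference, here is a sketch of the kind of initialization I am asking about; the 1 / sqrt(fan_in) scaling is an assumption on my part, not something I have verified for this network:

```python
import numpy as np

rng = np.random.default_rng(0)

fan_in = 784       # inputs feeding each hidden neuron
n_hidden = 1569

# Naive init: weights uniform in [-1, 1]. A weighted sum over 784 terms easily
# reaches double digits, and sigmoid(10) is already ~0.99995 (saturated).
w_naive = rng.uniform(-1.0, 1.0, size=(fan_in, n_hidden))

# Scaled init (assumption): dividing by sqrt(fan_in) keeps typical weighted
# sums near 0, where the sigmoid is still sensitive to changes.
w_scaled = rng.uniform(-1.0, 1.0, size=(fan_in, n_hidden)) / np.sqrt(fan_in)

x = rng.uniform(0.0, 1.0, size=fan_in)   # a fake normalized input image
print(np.abs(x @ w_naive).mean())        # large pre-activations -> saturation
print(np.abs(x @ w_scaled).mean())       # much smaller pre-activations
```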