I am currently working on the MNIST handwritten digits classification.
I built a single feed-forward network with the following structure:
- Input layer: 28×28 = 784 inputs
- Hidden layer: a single hidden layer with 1000 neurons
- Output Layer: 10 neurons
All neurons use the sigmoid activation function.
The predicted class is the one corresponding to the output neuron with the maximum output value.
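For reference, the forward pass described above can be sketched in NumPy. The weights and biases here are random placeholders for illustration, not trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder parameters; in practice these are learned by backpropagation.
W1 = rng.normal(0, 0.01, (784, 1000))  # input -> hidden
b1 = np.zeros(1000)
W2 = rng.normal(0, 0.01, (1000, 10))   # hidden -> output
b2 = np.zeros(10)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    """Forward pass: sigmoid hidden layer, sigmoid outputs, argmax class."""
    h = sigmoid(x @ W1 + b1)   # hidden activations, shape (1000,)
    o = sigmoid(h @ W2 + b2)   # output activations, shape (10,)
    return int(np.argmax(o))   # reported class = neuron with maximum output

x = rng.random(784)            # stand-in for one flattened 28x28 image
print(predict(x))              # prints a class label in 0..9
```

This is only a sketch of the inference step; the training loop and the stated success rates are from the network described above, not from this snippet.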
My questions are:
- Is it a good approach to create a single network with multiple outputs? I.e., should I instead create a separate network for each digit?
I ask because the network is currently stuck at a ~75% success rate. Since the ten "classifiers" effectively share the same hidden-layer neurons, I am not sure whether this sharing reduces the network's learning capability.
**EDIT:**
Since others may refer to this thread, I want to be honest and note that the ~75% success rate was after ~1500 epochs. I am now at nearly 3000 epochs and the success rate is ~85%, so it works pretty well.