As others have said, there is no "magic" rule to calculate the number of hidden layers and nodes of a neural network, but there are some tips or recommendations that can help you find good values.
The number of hidden nodes is based on a relationship between:
- Number of input and output nodes
- Amount of training data available
- Complexity of the function that is trying to be learned
- The training algorithm
To minimize the error and have a trained network that generalizes well, you need to pick an optimal number of hidden layers, as well as nodes in each hidden layer.
- Too few nodes will lead to high error for your system, as the predictive factors might be too complex for a small number of nodes to capture.
- Too many nodes will overfit to your training data and not generalize well.
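To make the "too few nodes" half of this trade-off concrete, here is a minimal NumPy sketch (my own illustration, not part of the answer above): a one-hidden-layer sigmoid network trained by plain gradient descent on XOR. With a single hidden unit the decision boundary stays linear and the network cannot fit XOR; with a few more units it usually can. The `train_mlp` helper and its hyperparameters are assumptions for the demo, not a recommended training setup.

```python
import numpy as np

def train_mlp(X, y, n_hidden, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer sigmoid network by plain gradient
    descent (illustrative sketch, not production code)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 1.0, (n_hidden, 1))
    b2 = np.zeros(1)

    def sig(z):
        return 1.0 / (1.0 + np.exp(-z))

    for _ in range(epochs):
        h = sig(X @ W1 + b1)              # hidden activations
        p = sig(h @ W2 + b2)              # predicted probabilities
        d2 = (p - y[:, None]) / len(X)    # sigmoid + cross-entropy gradient
        d1 = (d2 @ W2.T) * h * (1.0 - h)  # backprop through the hidden layer
        W2 -= lr * (h.T @ d2)
        b2 -= lr * d2.sum(axis=0)
        W1 -= lr * (X.T @ d1)
        b1 -= lr * d1.sum(axis=0)

    def predict(Xn):
        return sig(sig(Xn @ W1 + b1) @ W2 + b2)[:, 0]
    return predict

# XOR is not linearly separable, so it needs hidden capacity.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
for n_hidden in (1, 2, 4, 8):
    acc = ((train_mlp(X, y, n_hidden)(X) > 0.5) == y).mean()
    print(n_hidden, acc)
```

The overfitting half of the trade-off does not show up on four data points, of course; on real data you would compare these sizes on a held-out validation set rather than on the training set.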
You can find some general advice on this page:
Section - How many hidden units should I use?
If your data is linearly separable, then you don't need any hidden layers at all. Otherwise, there is a consensus about adding hidden layers: the situations in which performance improves with a second (or third, etc.) hidden layer are rare. One hidden layer is therefore sufficient for the large majority of problems.
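As a quick illustration of the "no hidden layers needed" case (my own sketch, not from the page linked above): a single perceptron, i.e. one linear threshold unit with no hidden layer, learns the linearly separable AND function with the classic perceptron update rule. The toy data and hyperparameters are assumptions for the demo.

```python
def perceptron_train(data, epochs=20, lr=0.1):
    """Classic perceptron learning rule: nudge the weights by the
    prediction error on each example. Converges on separable data."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

# AND truth table: linearly separable, so no hidden layer is required.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = perceptron_train(data)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else 0 for (x1, x2), _ in data]
print(preds)  # matches the AND column: [0, 0, 0, 1]
```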
There are some empirically derived rules of thumb; of these, the most commonly relied on is that 'the optimal size of the hidden layer is usually between the size of the input and the size of the output layers'.
In sum, for most problems, one could probably get decent performance by setting the hidden layer configuration using just two rules:
- The number of hidden layers equals one
- The number of neurons in that layer is the mean of the neurons in the input and output layers.
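The two rules reduce to a one-line calculation; a minimal sketch (the function name and the rounding-down choice are my own, since the answer doesn't specify how to handle a non-integer mean):

```python
def rule_of_thumb_hidden_size(n_inputs, n_outputs):
    """One hidden layer, sized as the mean of the input and
    output layer sizes (rounded down when the mean is not whole)."""
    return (n_inputs + n_outputs) // 2

# e.g. 10 input features and 3 output classes
print(rule_of_thumb_hidden_size(10, 3))  # prints 6
```

Treat the result as a starting point for a validation-set search, not a final answer.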