
I am trying to understand the difference between a neural network with just one output neuron and one with multiple neurons in the output layer.

enter image description here

I know that this type of neural network can solve a problem like the XOR logic gate; in fact, an ANN with fewer neurons in the hidden layer is enough for that.

enter image description here enter image description here

But it is not clear to me when and why I should use a neural network with this kind of topology, where the ANN has multiple neurons in the output layer.

Does anyone know the difference?

I'm by no means an expert on neural networks, but doesn't each output neuron represent a single output bit? You would therefore have more than one output neuron when you need an answer requiring more than one bit. The bottom example contains two output bits, and therefore provides for four possible output states. - Robert Harvey

1 Answer


The two architectures (single-output and multi-output) correspond to different problem setups: binary, multi-class, and multi-label classification.

Let's consider the options you have:

enter image description here

Binary classification - You are predicting the probability of the positive class. The positive and negative classes are the only two options, so a single output neuron is enough: its output is a probability between 0 and 1 (typically from a sigmoid activation), and the probability of the negative class is 1 minus that value. The loss function used here is binary_crossentropy.
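To make the single-output case concrete, here is a minimal pure-Python sketch of what one output neuron computes and how binary cross-entropy scores it. The function names, the `eps` clamp, and the example logit are my own illustrative choices, not taken from any particular library.

```python
import math

def sigmoid(z):
    """Squash a raw score (logit) into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-12):
    """Loss for one sample: y_true is 0 or 1, p is the predicted P(positive)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# Hypothetical raw output of the single output neuron (assumed value).
logit = 0.0
p_positive = sigmoid(logit)        # probability of the positive class
p_negative = 1.0 - p_positive      # the negative class needs no neuron of its own
loss_if_positive = binary_crossentropy(1, p_positive)
```

The point of the sketch is that one number is sufficient here, because the second class probability is fully determined by the first.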

Multi-class classification - You are predicting a probability between 0 and 1 for each of the n classes (where n >= 2), using one output neuron per class. If each sample belongs to exactly one class, it is called multi-class single-label classification; the outputs are then typically passed through a softmax so that they sum to 1, and the usual loss is categorical_crossentropy.
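As a rough illustration of the multi-class single-label case, the sketch below assumes the common choice of a softmax over n output neurons with categorical cross-entropy as the loss. The logits, the three-class setup, and the helper names are made up for the example.

```python
import math

def softmax(logits):
    """Turn n raw scores into n probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(true_index, probs, eps=1e-12):
    """Loss for one sample whose single correct class is true_index."""
    return -math.log(max(probs[true_index], eps))

# Hypothetical raw outputs of three output neurons (one per class).
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# The predicted class is the neuron with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
loss = categorical_crossentropy(0, probs)  # assume class 0 is the true label
```

Because the probabilities compete for a total of 1, raising one class necessarily lowers the others, which matches the "exactly one class per sample" assumption.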

Multi-label classification - Each sample can belong to several classes at once, so you are working with a multi-class multi-label problem. This also gives you a probability between 0 and 1 for each of the n classes, but each output neuron is treated as an independent yes/no decision (typically with a sigmoid), so the probabilities do not need to sum to 1. The loss used in this case is the same as for binary classification, binary_crossentropy, applied to every output.
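The multi-label case can be sketched the same way. Here each output neuron gets its own sigmoid and the binary cross-entropy is averaged over the labels, which is one common convention. The logits, the 0.5 decision threshold, and the function names are assumptions for illustration only.

```python
import math

def sigmoid(z):
    """Squash a raw score (logit) into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def multilabel_bce(y_true, probs, eps=1e-12):
    """Mean binary cross-entropy over all labels of one sample."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Hypothetical raw outputs of three output neurons (one per label).
logits = [3.0, -2.0, 1.5]
probs = [sigmoid(z) for z in logits]  # independent, so they need not sum to 1

# Assumed decision rule: a label is "on" when its probability is at least 0.5.
predicted_labels = [int(p >= 0.5) for p in probs]
loss = multilabel_bce([1, 0, 1], probs)  # assume labels 0 and 2 are the true ones
```

Compared with the softmax setup, several labels can be switched on for the same sample, since no output takes probability away from another.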

So, at the end of the day, it's about how you are setting up your problem.