For binary classification, we could use either a final linear layer with 1 output and a sigmoid with a threshold, or a final linear layer with 2 outputs and a softmax. Is there any advantage to one over the other?

1 Answer

If you are doing binary classification, I would suggest a single output node with a sigmoid. If your problem is multi-class classification, I would suggest as many output nodes as there are labels, with a softmax.
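To see why the two binary setups are so closely related, here is a minimal sketch in plain Python (no framework assumed). The helper names `sigmoid`, `softmax2`, and the sample logits are made up for illustration. It checks that a softmax over two logits gives the same positive-class probability as a sigmoid applied to the difference of those logits.

```python
import math


def sigmoid(z):
    # Logistic function: maps a single logit to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def softmax2(z0, z1):
    # Softmax over exactly two logits; subtracting the max keeps exp() stable.
    m = max(z0, z1)
    e0 = math.exp(z0 - m)
    e1 = math.exp(z1 - m)
    total = e0 + e1
    return e0 / total, e1 / total


# Hypothetical logits from a 2-output final layer (class 0, class 1).
z0, z1 = 0.3, 1.7

p_softmax = softmax2(z0, z1)[1]   # P(class 1) from the 2-output head
p_sigmoid = sigmoid(z1 - z0)      # P(class 1) from one logit equal to z1 - z0

# The two probabilities agree up to floating-point error.
print(abs(p_softmax - p_sigmoid) < 1e-12)
```

So the 2-output softmax head represents the same family of predictions as the 1-output sigmoid head, just with a redundant set of weights, and picking the larger softmax output matches thresholding the sigmoid at 0.5. That is one reason to prefer the single sigmoid output for the binary case.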