Why does this semantic segmentation network have no softmax classification layer in PyTorch?

Question

I am trying to use the following CNN architecture for semantic pixel classification. The code I am using is here.

However, from my understanding this type of semantic segmentation network typically should have a softmax output layer for producing the classification result.

I could not find softmax used anywhere in the script. Here is the paper I am reading on this segmentation architecture. Figure 2 of the paper shows a softmax layer, so I would like to find out why it is missing from the script. Any insight is welcome.

Shai Shai · Accepted Answer · 2019-01-08T06:20:23

The code you are using for training and inference is quite complex. But if you dig a little, you'll see that the loss functions are implemented here and that your model is actually trained using the cross_entropy loss. From the documentation:

This criterion combines log_softmax and nll_loss in a single function.
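To make the quoted sentence concrete, here is a minimal sketch that compares the combined function against the explicit two-step version. The tensor shapes (2 images, 5 classes, 4x4 pixels) are made up for illustration and are not taken from the linked code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes: batch of 2 images, 5 classes, 4x4 pixels.
logits = torch.randn(2, 5, 4, 4)         # raw model outputs, no softmax applied
target = torch.randint(0, 5, (2, 4, 4))  # one class index per pixel

# cross_entropy applies log_softmax over the class dimension (dim=1) internally.
loss_combined = F.cross_entropy(logits, target)

# The same computation written out as two separate steps.
loss_two_step = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(loss_combined.item(), loss_two_step.item())
```

The two printed values should agree up to floating-point rounding, which is why the model itself does not need a softmax layer during training.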

For numerical stability, it is better to "absorb" the softmax into the loss function rather than compute it explicitly in the model.
It is common practice to have the model output "raw" predictions (a.k.a. "logits") and let the loss (a.k.a. criterion) apply the softmax internally. If you really need the probabilities, you can add a softmax on top when deploying your model.
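A small sketch of both points, under made-up inputs: the first part uses a deliberately extreme pair of logits to show where an explicit softmax followed by a log breaks down, and the second part shows how probabilities could be recovered at deployment. The tensor names and shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical logits for one sample with two classes; the large gap is deliberate.
logits = torch.tensor([[1000.0, 0.0]])
target = torch.tensor([1])

# Explicit softmax followed by log: the probability of class 1 underflows to 0,
# so its log is -inf and the resulting loss is infinite.
naive_loss = F.nll_loss(torch.log(torch.softmax(logits, dim=1)), target)

# cross_entropy works in log space and stays finite for the same inputs.
stable_loss = F.cross_entropy(logits, target)

# At deployment: softmax over the class dimension turns logits into probabilities.
seg_logits = torch.randn(1, 3, 4, 4)      # (batch, classes, height, width)
probs = torch.softmax(seg_logits, dim=1)  # sums to 1 over the class dimension
pred = probs.argmax(dim=1)                # per-pixel class index

print(naive_loss.item(), stable_loss.item(), pred.shape)
```

If only the predicted class per pixel is needed, taking the argmax of the logits directly should give the same labels, since softmax preserves the ordering of the scores.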
