
I am new to PyTorch and currently working on a simple transfer-learning example. While training my model, I see large swings in both accuracy and loss from one epoch to the next. I trained the network for 50 epochs, and below is the result:

Epoch [1/50], Loss: 0.5477, Train Accuracy: 63%
Epoch [2/50], Loss: 2.1935, Train Accuracy: 75%
Epoch [3/50], Loss: 1.8811, Train Accuracy: 79%
Epoch [4/50], Loss: 0.0671, Train Accuracy: 77%
Epoch [5/50], Loss: 0.2522, Train Accuracy: 80%
Epoch [6/50], Loss: 0.0962, Train Accuracy: 88%
Epoch [7/50], Loss: 1.8883, Train Accuracy: 74%
Epoch [8/50], Loss: 0.3565, Train Accuracy: 83%
Epoch [9/50], Loss: 0.0228, Train Accuracy: 81%
Epoch [10/50], Loss: 0.0124, Train Accuracy: 81%
Epoch [11/50], Loss: 0.0252, Train Accuracy: 84%
Epoch [12/50], Loss: 0.5184, Train Accuracy: 81%
Epoch [13/50], Loss: 0.1233, Train Accuracy: 86%
Epoch [14/50], Loss: 0.1704, Train Accuracy: 82%
Epoch [15/50], Loss: 2.3164, Train Accuracy: 79%
Epoch [16/50], Loss: 0.0294, Train Accuracy: 85%
Epoch [17/50], Loss: 0.2860, Train Accuracy: 85%
Epoch [18/50], Loss: 1.5114, Train Accuracy: 81%
Epoch [19/50], Loss: 0.1136, Train Accuracy: 86%
Epoch [20/50], Loss: 0.0062, Train Accuracy: 80%
Epoch [21/50], Loss: 0.0748, Train Accuracy: 84%
Epoch [22/50], Loss: 0.1848, Train Accuracy: 84%
Epoch [23/50], Loss: 0.1693, Train Accuracy: 81%
Epoch [24/50], Loss: 0.1297, Train Accuracy: 77%
Epoch [25/50], Loss: 0.1358, Train Accuracy: 78%
Epoch [26/50], Loss: 2.3172, Train Accuracy: 75%
Epoch [27/50], Loss: 0.1772, Train Accuracy: 79%
Epoch [28/50], Loss: 0.0201, Train Accuracy: 80%
Epoch [29/50], Loss: 0.3810, Train Accuracy: 84%
Epoch [30/50], Loss: 0.7281, Train Accuracy: 79%
Epoch [31/50], Loss: 0.1918, Train Accuracy: 81%
Epoch [32/50], Loss: 0.3289, Train Accuracy: 88%
Epoch [33/50], Loss: 1.2363, Train Accuracy: 81%
Epoch [34/50], Loss: 0.0362, Train Accuracy: 89%
Epoch [35/50], Loss: 0.0303, Train Accuracy: 90%
Epoch [36/50], Loss: 1.1700, Train Accuracy: 81%
Epoch [37/50], Loss: 0.0031, Train Accuracy: 81%
Epoch [38/50], Loss: 0.1496, Train Accuracy: 81%
Epoch [39/50], Loss: 0.5070, Train Accuracy: 76%
Epoch [40/50], Loss: 0.1984, Train Accuracy: 77%
Epoch [41/50], Loss: 0.1152, Train Accuracy: 79%
Epoch [42/50], Loss: 0.0603, Train Accuracy: 82%
Epoch [43/50], Loss: 0.2293, Train Accuracy: 84%
Epoch [44/50], Loss: 0.1304, Train Accuracy: 80%
Epoch [45/50], Loss: 0.0381, Train Accuracy: 82%
Epoch [46/50], Loss: 0.1833, Train Accuracy: 84%
Epoch [47/50], Loss: 0.0222, Train Accuracy: 84%
Epoch [48/50], Loss: 0.0010, Train Accuracy: 81%
Epoch [49/50], Loss: 1.0852, Train Accuracy: 79%
Epoch [50/50], Loss: 0.0167, Train Accuracy: 83%

Some epochs achieve a much better accuracy and loss than others, but the model loses those gains in later epochs. As far as I know, the accuracy should improve with every epoch. Did I write the training code wrongly? If not, is this normal, and is there a way to fix it? Should I save the previous epoch's accuracy and only keep training while the next epoch improves on it (see the sketch after my code)? I worked with Keras before and never ran into this problem. I am fine-tuning a ResNet by freezing the pretrained weights and training only a new final layer with 2 output classes. Below is my code:

import torch
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
# Only the parameters of the new final layer are optimized; the rest are frozen.
optimizer = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

num_epochs = 50
for epoch in range(num_epochs):
    # Reset the count of correct predictions after passing through the dataset
    correct = 0
    for images, labels in dataloaders['train']:
        if torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()

        optimizer.zero_grad()
        outputs = model_conv(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        _, predicted = torch.max(outputs, 1)
        correct += (predicted == labels).sum().item()

    train_acc = 100 * correct / dataset_sizes['train']
    print('Epoch [{}/{}], Loss: {:.4f}, Train Accuracy: {:.0f}%'
          .format(epoch + 1, num_epochs, loss.item(), train_acc))
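To make the last idea concrete, this is roughly what I had in mind (an untested sketch; train_one_epoch is a hypothetical helper wrapping the loop above and returning that epoch's training accuracy, and 'best_model.pth' is just a placeholder filename):

best_acc = 0.0  # best training accuracy seen so far

for epoch in range(num_epochs):
    # hypothetical helper wrapping the training loop above
    train_acc = train_one_epoch()
    if train_acc > best_acc:
        best_acc = train_acc
        # keep a checkpoint only when this epoch beats the best so far
        torch.save(model_conv.state_dict(), 'best_model.pth')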
Have you tried to reduce the learning rate? – iacolippo

2 Answers


I would say it depends on the dataset and the architecture, so fluctuations are normal, but in general the loss should improve. They could also be a result of noise in the test dataset, i.e. wrongly labeled examples.

If the test accuracy starts to decrease, it might be that your network is overfitting. You might want to stop training just before you reach that point, or take other steps to counter the overfitting problem.
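For example, a minimal early-stopping loop might look like the sketch below (assumptions: a held-out validation split dataloaders['val'] with its size in dataset_sizes['val'], a patience of 5 epochs, and 'best_model.pth' as a placeholder checkpoint path):

import torch

best_val_acc = 0.0
epochs_without_improvement = 0
patience = 5  # assumed: give up after 5 epochs without improvement

for epoch in range(num_epochs):
    # ... training loop exactly as in the question ...

    # Evaluate on the held-out validation split
    model_conv.eval()
    correct = 0
    with torch.no_grad():
        for images, labels in dataloaders['val']:
            if torch.cuda.is_available():
                images = images.cuda()
                labels = labels.cuda()
            outputs = model_conv(images)
            _, predicted = torch.max(outputs, 1)
            correct += (predicted == labels).sum().item()
    val_acc = 100.0 * correct / dataset_sizes['val']
    model_conv.train()

    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_without_improvement = 0
        # keep the best weights seen so far
        torch.save(model_conv.state_dict(), 'best_model.pth')
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print('Stopping early after epoch {}'.format(epoch + 1))
            break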


"Is it normal in PyTorch for accuracy to increase and decrease repeatedly?"

At the epoch level, the loss should generally keep going down. At the level of individual batches it may fluctuate, but over time it should get smaller, since minimizing the loss is exactly how we improve the accuracy.
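Note also that in the code from the question, loss.item() in the print statement is the loss of the last batch only, which makes the printed numbers jump around. Accumulating and averaging the loss over all batches gives a smoother and more representative per-epoch value. A sketch of that change, reusing the names from the question:

for epoch in range(num_epochs):
    correct = 0
    running_loss = 0.0
    for images, labels in dataloaders['train']:
        if torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()

        optimizer.zero_grad()
        outputs = model_conv(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # accumulate the batch loss, weighted by the number of samples
        running_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        correct += (predicted == labels).sum().item()

    epoch_loss = running_loss / dataset_sizes['train']
    train_acc = 100.0 * correct / dataset_sizes['train']
    print('Epoch [{}/{}], Loss: {:.4f}, Train Accuracy: {:.0f}%'
          .format(epoch + 1, num_epochs, epoch_loss, train_acc))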