
I am trying to do multiclass classification in Keras. So far I have been using categorical_crossentropy as the loss function, but since the required metric is weighted F1, I am not sure categorical_crossentropy is the best choice of loss. I tried to implement a weighted F1 score in Keras using sklearn.metrics.f1_score, but I am running into errors because of the conversion between tensors and scalars.

Something like this:

import numpy as np
from sklearn.metrics import f1_score

def f1_loss(y_true, y_pred):
    return 1 - f1_score(np.argmax(y_true, axis=1), np.argmax(y_pred, axis=1), average='weighted')

Followed by

 model.compile(loss=f1_loss, optimizer=opt)

How do I write this loss function in Keras?

Edit:

The shape of y_true and y_pred is (n_samples, n_classes); in my case it is (n_samples, 4).

y_true and y_pred are both tensors, so sklearn's f1_score cannot work on them directly. I need a function that computes weighted F1 on tensors.

Please post the shapes for y_true and y_pred. – Mihai Alexandru-Ionut
Shape is (n_samples, n_classes); in my case it is (n_samples, 4). – Nikhil Mishra
kaggle.com/rejpalcz/best-loss-function-for-f1-score-metric: this is a non-weighted F1 loss implemented for 2 classes. – Nikhil Mishra

1 Answer


The variables are self-explanatory:

from keras import backend as K

def f1_weighted(true, pred): #shapes (batch, 4)

    #for metrics include these two lines, for loss, don't include them
    #these are meant to round 'pred' to exactly zeros and ones
    #predLabels = K.argmax(pred, axis=-1)
    #pred = K.one_hot(predLabels, 4) 


    ground_positives = K.sum(true, axis=0) + K.epsilon()       # = TP + FN
    pred_positives = K.sum(pred, axis=0) + K.epsilon()         # = TP + FP
    true_positives = K.sum(true * pred, axis=0) + K.epsilon()  # = TP
        #all with shape (4,)
    
    precision = true_positives / pred_positives 
    recall = true_positives / ground_positives
        #both = 1 if ground_positives == 0 or pred_positives == 0
        #shape (4,)

    f1 = 2 * (precision * recall) / (precision + recall + K.epsilon())
        #still with shape (4,)

    weighted_f1 = f1 * ground_positives / K.sum(ground_positives) 
    weighted_f1 = K.sum(weighted_f1)

    
    return 1 - weighted_f1 #for metrics, return only 'weighted_f1'

Important notes:

This loss will work batchwise (as any Keras loss).

So if you are working with small batch sizes, the results will be unstable between each batch, and you may get a bad result. Use big batch sizes, enough to include a significant number of samples for all classes.

Since this loss collapses the batch dimension, you will not be able to use some Keras features that depend on it, such as sample weights, for instance.
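To sanity-check the formula outside Keras, here is a NumPy mirror of the same computation (a sketch; the function name f1_weighted_np and the epsilon value are my own, and with hard one-hot predictions it should agree closely with sklearn's f1_score(..., average='weighted')):

```python
import numpy as np

def f1_weighted_np(true, pred, eps=1e-7):
    # NumPy mirror of the Keras loss above.
    # 'true' and 'pred' are (batch, n_classes) arrays; 'pred' may be soft.
    ground_positives = true.sum(axis=0) + eps          # TP + FN per class
    pred_positives = pred.sum(axis=0) + eps            # TP + FP per class
    true_positives = (true * pred).sum(axis=0) + eps   # TP per class

    precision = true_positives / pred_positives
    recall = true_positives / ground_positives
    f1 = 2 * precision * recall / (precision + recall + eps)  # shape (n_classes,)

    # weight each class's F1 by its support, then sum
    weighted_f1 = (f1 * ground_positives / ground_positives.sum()).sum()
    return 1 - weighted_f1

# hard one-hot example with 4 classes
labels_true = [0, 0, 1, 1, 2, 3]
labels_pred = [0, 1, 1, 1, 2, 3]
true = np.eye(4)[labels_true]
pred = np.eye(4)[labels_pred]
loss = f1_weighted_np(true, pred)
# weighted F1 here is 2/6*(2/3) + 2/6*(4/5) + 1/6 + 1/6 ≈ 0.8222,
# so the loss is ≈ 0.1778
```

With soft probabilities instead of hard one-hots, the same code gives the differentiable "soft F1" that the loss actually optimizes during training.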