
I'm working on a multiclass classification problem using Python and scikit-learn. Currently, I'm using the classification_report function to evaluate the performance of my classifier, and I get reports like the following:

>>> print(classification_report(y_true, y_pred, target_names=target_names))
             precision    recall  f1-score   support

    class 0       0.50      1.00      0.67         1
    class 1       0.00      0.00      0.00         1
    class 2       1.00      0.67      0.80         3

avg / total       0.70      0.60      0.61         5

For further analysis, I'm interested in obtaining the F1 score of each individual class. Maybe something like this:

>>> print(calculate_f1_score(y_true, y_pred, target_class='class 0'))

Is there something like that available in scikit-learn?


3 Answers


Taken from the f1_score docs.

from sklearn.metrics import f1_score
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

f1_score(y_true, y_pred, average=None)


array([ 0.8,  0. ,  0. ])

These are the scores for each class, in sorted label order.
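
To get the score of one named class, as the question asks, the per-class values can be paired with the class names. The sketch below writes the per-class computation out in plain Python so that it runs without any library. The names per_class_f1 and target_names are illustrative and not part of scikit-learn; for these inputs the result should match what average=None returns.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); the zero guard is kept for safety
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
target_names = ['class 0', 'class 1', 'class 2']  # hypothetical names, one per label

by_label = per_class_f1(y_true, y_pred)
by_name = {target_names[label]: score for label, score in by_label.items()}
print(by_name['class 0'])  # 0.8

# With scikit-learn and labels 0..n-1, the equivalent would be:
# dict(zip(target_names, f1_score(y_true, y_pred, average=None)))
```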


If you only have the confusion matrix C, with rows corresponding to predictions and columns corresponding to truth, you can compute the support-weighted average F1 score using the following function. Note that sklearn.metrics.confusion_matrix uses the opposite orientation (rows are truth, columns are predictions), so pass its transpose.

import numpy as np

def f1(C):
    # C[i, j] = number of samples predicted as class i whose true class is j
    num_classes = np.shape(C)[0]
    weighted_f1 = np.zeros(shape=(num_classes,), dtype='float32')
    # support of each class as a fraction of all samples
    weights = np.sum(C, axis=0)/np.sum(C)

    for j in range(num_classes):
        # indices of every class except j
        others = np.concatenate((np.arange(0, j), np.arange(j+1, num_classes)))
        tp = C[j, j]
        fp = np.sum(C[j, others])  # predicted j, true class differs
        fn = np.sum(C[others, j])  # true class j, predicted otherwise

        precision = tp/(tp+fp) if (tp+fp) > 0 else 0
        recall = tp/(tp+fn) if (tp+fn) > 0 else 0
        # F1 of class j, scaled by its support weight
        weighted_f1[j] = 2*precision*recall/(precision + recall)*weights[j] if (precision + recall) > 0 else 0

    # the sum of the weighted per-class scores is the weighted-average F1
    return np.sum(weighted_f1)
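
Since the function above returns a single weighted average rather than one score per class, here is a minimal sketch of a per-class variant under the same convention (rows are predictions, columns are truth). The name per_class_f1_from_confusion and the example matrix are assumptions made for illustration; the matrix is built by hand from six samples, and the code accepts a nested list or a 2-D array.

```python
def per_class_f1_from_confusion(C):
    """C[i][j] = samples predicted as class i whose true class is j."""
    n = len(C)
    scores = []
    for j in range(n):
        tp = C[j][j]
        fp = sum(C[j]) - tp                       # row j: predicted j, truth differs
        fn = sum(C[i][j] for i in range(n)) - tp  # column j: truth j, predicted otherwise
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN), or 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

# Illustrative matrix: rows are predictions, columns are truth
C = [[2, 1, 0],
     [0, 0, 2],
     [0, 1, 0]]

scores = per_class_f1_from_confusion(C)
print(scores)  # [0.8, 0.0, 0.0]

# Weighting each score by its class support (column sums) gives the average
support = [sum(row[j] for row in C) for j in range(len(C))]
weighted = sum(s * w for s, w in zip(scores, support)) / sum(support)
```

Keeping the per-class list separate from the averaging step lets the same scores feed either a per-class report or a weighted summary.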

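For a binary problem, you just need to pass pos_label and set it to the class whose score you want. pos_label only takes effect with average='binary', which is the default and raises an error on multiclass targets; for a multiclass problem, select the class with labels and average=None instead. Despite its name, ypred_prob must contain predicted class labels, not probabilities.

f1_score(ytest, ypred_prob, pos_label=0)  # binary targets; default is pos_label=1
f1_score(ytest, ypred_prob, labels=[0], average=None)  # multiclass targets; returns a one-element array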