Precision and Recall computation for different group sizes

Question

I couldn't find an answer to this question anywhere, so I hope someone here can help me and others with the same problem.

Suppose that I have 1000 Positive samples and 1500 Negative samples.

Now, suppose that there are 950 True Positives (positive samples correctly classified as positive) and 100 False Positives (negative samples incorrectly classified as positive).

Should I use these raw numbers to compute the Precision, or should I consider the different group sizes?

In other words, should my precision be:

TruePositive / (TruePositive + FalsePositive) = 950 / (950 + 100) = 90.476%

OR should it be:

(TruePositive / 1000) / [(TruePositive / 1000) + (FalsePositive / 1500)] = 0.95 / (0.95 + 0.0667) ≈ 93.44%

In the first calculation, I used the raw counts without regard to the number of samples in each group. In the second calculation, I divided each count by the size of its own group, to remove the bias caused by the groups' different sizes.

Nikita Astrakhantsev · Accepted Answer · 2015-12-19T22:41:57

Answering the stated question: by definition, precision is computed by the first formula: TP/(TP+FP).
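As a minimal sketch of that definition, the snippet below plugs the counts from the question into TP/(TP+FP). The helper names `precision` and `recall` are illustrative, not taken from any library, and the false-negative count is derived here on the assumption that the remaining 50 positive samples were classified as negative.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of samples predicted positive that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of truly positive samples that were predicted positive."""
    return tp / (tp + fn)


# Counts from the question.
tp, fp = 950, 100
n_pos = 1000

# Assumption: every positive sample that is not a true positive is a false negative.
fn = n_pos - tp

print(f"precision = {precision(tp, fp):.3%}")  # 90.476%
print(f"recall    = {recall(tp, fn):.3%}")     # 95.000%
```

Note that the group sizes do not appear in the precision formula: it depends only on what was predicted positive.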

However, that doesn't mean you have to use this formula, i.e. the precision measure. There are many other measures; look at the table on this wiki page and choose the one best suited to your task.

For example, the positive likelihood ratio (true positive rate divided by false positive rate) seems to be the most similar to your second formula.
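To make the connection concrete, here is a rough sketch using the numbers from the question. The second formula is TPR / (TPR + FPR), which is the precision one would observe if both groups had the same size, and it equals LR+ / (1 + LR+). The variable name `balanced_precision` is just a label chosen for this example, not a standard term.

```python
import math

# Counts and group sizes from the question.
tp, fp = 950, 100
n_pos, n_neg = 1000, 1500

tpr = tp / n_pos  # true positive rate (recall, sensitivity)
fpr = fp / n_neg  # false positive rate (1 - specificity)

# Positive likelihood ratio: how much more likely a positive prediction
# is for a positive sample than for a negative one.
lr_plus = tpr / fpr

# The second formula from the question (hypothetical name).
balanced_precision = tpr / (tpr + fpr)

# It is a monotonic function of LR+, so both rank classifiers the same way.
assert math.isclose(balanced_precision, lr_plus / (1 + lr_plus))

print(f"LR+                = {lr_plus:.2f}")             # 14.25
print(f"balanced precision = {balanced_precision:.2%}")  # 93.44%
```

Both quantities depend only on per-group rates, so they do not change when the ratio of positive to negative samples changes, whereas plain precision does.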
