4 votes

I am reading about precision and recall in machine learning.

Question 1: When are precision and recall inversely related? That is, when does the situation occur where you can improve your precision but at the cost of lower recall, and vice versa? The Wikipedia article states:

Often, there is an inverse relationship between precision and recall, where it is possible to increase one at the cost of reducing the other. Brain surgery provides an obvious example of the tradeoff.

However, I have seen research experiment results where both precision and recall increase simultaneously (for example, as you use different or more features).

In what scenarios does the inverse relationship hold?

Question 2: I'm familiar with the concepts of precision and recall in two fields: information retrieval (e.g. "return the 100 most relevant pages out of a 1-million-page corpus") and binary classification (e.g. "classify each of these 100 patients as having the disease or not"). Are precision and recall inversely related in both of these fields, or only in one?

2
The Wikipedia article says "Often, there is an inverse relationship." That means not always, just often. You might change your title to something more like "When are precision and recall inversely related?" – Eric J.
Thanks. Made the change. – stackoverflowuser2010

2 Answers

7 votes

The inverse relationship only holds when there is some parameter in the system that you can vary to get more or fewer results. Then there is a straightforward trade-off: lower the threshold and you get more results, some of which are true positives and some false positives, so recall rises while precision tends to fall. Even then, the two metrics do not always move in strict opposition; the actual relationship can be mapped by sweeping the threshold and plotting a precision-recall (or ROC) curve. As for Question 2: likewise, in both of those tasks precision and recall are not necessarily inversely related.
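
To make that concrete, here is a minimal sketch (my own, using synthetic labels and scores and scikit-learn metrics, nothing taken from the question) of sweeping a decision threshold:

```python
# Sketch with synthetic data: sweep a decision threshold and watch
# precision and recall trade off against each other.
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)        # hypothetical ground-truth labels
scores = 0.3 * y_true + rng.random(1000)      # noisy scores: higher = "more positive"

for threshold in (0.3, 0.5, 0.7, 0.9):
    y_pred = (scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred)
    r = recall_score(y_true, y_pred)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

With a low threshold almost everything is flagged positive, so recall is near 1 but precision sinks toward the base rate; with a high threshold only confident cases are flagged, so precision rises and recall falls.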

So how do you increase recall or precision without hurting the other? Usually by improving the algorithm or the model. That is, when you only change the parameters of a given model (such as its decision threshold), the inverse relationship will usually hold, though note that it is usually non-linear. But if you, for example, add more descriptive features to the model, you can increase both metrics at once (see the sketch below).
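
As a hedged illustration of that point (synthetic data via scikit-learn's make_classification and my own choice of model, not something from the original answer), the same classifier trained on a richer feature set can score higher on both metrics at once:

```python
# Illustrative sketch: a model with more informative features can improve
# precision and recall together (synthetic data, results will vary).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n_feats in (2, 10):                       # few features vs. all features
    clf = LogisticRegression(max_iter=1000).fit(X_train[:, :n_feats], y_train)
    y_pred = clf.predict(X_test[:, :n_feats])
    print(f"{n_feats} features: precision={precision_score(y_test, y_pred):.2f} "
          f"recall={recall_score(y_test, y_pred):.2f}")
```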

1
votes

Regarding the first question, I interpret these concepts in terms of how restrictive your results must be.

If you are more restrictive, that is, more demanding about the correctness of the results, you want them to be more precise. To achieve that, you may be willing to reject some correct results as long as everything you return is correct; you are raising precision and lowering recall. Conversely, if you do not mind getting some incorrect results as long as you get all the correct ones, you are raising recall and lowering precision.
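
A small worked example with made-up confusion-matrix counts (purely illustrative, not from the answer), comparing a strict setting with a lenient one:

```python
# Toy counts: precision = TP / (TP + FP), recall = TP / (TP + FN).
def precision_recall(tp, fp, fn):
    return tp / (tp + fp), tp / (tp + fn)

# Strict: only flag cases we are very sure about.
p, r = precision_recall(tp=40, fp=5, fn=60)    # precision 0.89, recall 0.40
print(f"strict:  precision={p:.2f} recall={r:.2f}")

# Lenient: flag almost everything that might be positive.
p, r = precision_recall(tp=95, fp=80, fn=5)    # precision 0.54, recall 0.95
print(f"lenient: precision={p:.2f} recall={r:.2f}")
```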

As for the second question, looking at it from the point of view of the paragraphs above, I would say that yes, they are inversely related in both fields.

To the best of my knowledge, in order to increase both precision and recall, you'll need either a better model (one more suitable for your problem) or better data, or both.