I've been reading a lot about precision-recall curves in order to evaluate my image retrieval system. In particular, I'm reading this article about feature extractors in VLFeat and the Wikipedia page on precision and recall.
I understand that this curve is useful for evaluating a system's performance with respect to the number of elements retrieved. So we repeatedly compute precision and recall when retrieving the top element, then the top 2, the top 3, and so on. But my question is: when do we stop?
My intuition is: we stop when our list of retrieved elements has recall equal to 1, i.e. when we have retrieved all the relevant elements (there are no false negatives, although the list may still contain false positives).
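To make this intuition concrete, here is a minimal sketch of what I have in mind. It assumes the ranked results are given as a list of 0/1 relevance flags and that the total number of relevant items is known. The function names `precision_recall_at_each_k` and `first_k_with_full_recall`, and the example ranking, are my own and do not come from VLFeat or any other library.

```python
def precision_recall_at_each_k(ranked_relevance, num_relevant):
    """Precision and recall for the top-1, top-2, ... prefixes of a ranking.

    ranked_relevance: 0/1 flags in rank order (1 = relevant), hypothetical input.
    num_relevant: total number of relevant items in the whole collection.
    """
    precisions, recalls = [], []
    hits = 0  # relevant items seen so far (true positives)
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        hits += int(is_relevant)
        precisions.append(hits / k)            # TP / retrieved
        recalls.append(hits / num_relevant)    # TP / all relevant
    return precisions, recalls


def first_k_with_full_recall(recalls):
    """Smallest 1-based k at which recall reaches 1, or None if it never does."""
    for k, r in enumerate(recalls, start=1):
        if r >= 1.0:
            return k
    return None


# Made-up ranking: 3 relevant items, found at ranks 1, 3 and 6.
ranked = [1, 0, 1, 0, 0, 1, 0, 0]
precisions, recalls = precision_recall_at_each_k(ranked, num_relevant=3)
stop_k = first_k_with_full_recall(recalls)
print(stop_k, precisions[stop_k - 1], recalls[stop_k - 1])
```

In this example the last relevant item sits at rank 6, so every point after it has recall 1 and a strictly lower precision.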
The same question applies to average precision (AP): how many elements should the retrieved list contain when computing it? If my previous intuition is correct, then we just need to find the smallest list such that recall is 1 and use it to compute AP.
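As a sanity check, here is a small sketch of AP. It assumes the non-interpolated definition: the mean of precision@k over the ranks k at which a relevant item appears, divided by the total number of relevant items. Libraries may use an interpolated variant instead, so treat this as an assumption. The function name `average_precision` and the example ranking are hypothetical.

```python
def average_precision(ranked_relevance, num_relevant):
    """Non-interpolated AP for a ranking given as 0/1 relevance flags.

    Only ranks holding a relevant item contribute a precision term.
    """
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / k  # precision at this relevant rank
    return precision_sum / num_relevant


# Made-up ranking: 3 relevant items, found at ranks 1, 3 and 6.
ranked = [1, 0, 1, 0, 0, 1, 0, 0]
ap_full = average_precision(ranked, num_relevant=3)
ap_truncated = average_precision(ranked[:6], num_relevant=3)  # cut at recall = 1
print(ap_full, ap_truncated)
```

Under this definition, items ranked after the last relevant one add nothing to the sum, so truncating at the smallest list with recall 1 gives the same value as using the full ranking. If the list is cut before recall reaches 1, the missing relevant items contribute zero, because the denominator is still the total number of relevant items.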
I wonder why none of the libraries for computing P-R curves document how this is implemented.