8 votes

I am trying to create a C++ program that compares a list of images against one input image. I have the whole thing working, and the program produces DMatch matches.

Now I am trying to determine which image from the list is the best match for the source image. I first tried to do this by just comparing how many matches there were between the images, but the problem is that when a generated image has a lot of keypoints, it also tends to have a lot of matches, at least in my program.

So how can I determine which of the array of images is the best match for the source image? I am using this loop to determine the matches, but it doesn't really work:

vector< vector<DMatch> > filteredMatches;
vector<int> goodIds;
Ptr<DescriptorMatcher> matcher(new BFMatcher(NORM_HAMMING, false));

printf("bad matches: ");

for(size_t i = 0; i < images.size();i++){
    vector<DMatch> matches, good_matches;

    matcher->clear();
    matcher->match(images[i], tex_des, matches);
    if(matches.size() < 8){
        printf("F%d,", (int)i + 1);
        continue;
    }

    double min_dist = 100;  // smallest descriptor distance among this image's matches

    for(size_t j = 0; j < matches.size(); j++ ){ 
        double dist = matches[j].distance;
        if( dist < min_dist ) 
            min_dist = dist;
    }

    if(min_dist > 50.0){
        printf("D%d,", (int)i + 1);
        continue;
    }

    double good_dist = min_dist * 3;  // keep matches within 3x the best distance
    for(size_t j = 0; j < matches.size(); j++ ){
        if(matches[j].distance < good_dist)
            good_matches.push_back(matches[j]);
    }

    size_t size = good_matches.size();
    if(size < 8){
        printf("M%d,", (int)i + 1);
        continue;
    }

    vector<Point2f> srcPoints(size);
    vector<Point2f> dstPoints(size);

    for(size_t j = 0; j < size; j++){
        srcPoints[j] = destination[good_matches[j].trainIdx].pt;    
        dstPoints[j] = keyPoints[i][good_matches[j].queryIdx].pt;   
    }

    vector<unsigned char> inliersMask(srcPoints.size());
    // CV_FM_RANSAC belongs to findFundamentalMat; findHomography expects CV_RANSAC
    Mat H = findHomography(srcPoints, dstPoints, CV_RANSAC, 3.0, inliersMask);

    vector<DMatch> inliers;
    for(size_t j = 0; j < inliersMask.size(); j++){
        if(inliersMask[j]){
            inliers.push_back(good_matches[j]);
        }
    }
    if(inliers.size() < 4){
        printf("S%d,", (int)i + 1);
        continue;
    }

    filteredMatches.push_back(inliers);
    goodIds.push_back((int)i);

    H.release();
}

printf(" good matches: ");

int best = -1;
int amount = 0;
for(size_t i = 0; i < filteredMatches.size(); i++){
    int size = (int)filteredMatches.at(i).size();
    if(size < 8) continue;

    printf("%d,", goodIds[i] + 1);

    if(amount < size){
        amount = size;
        best = i;
    }
}

if(best >= 0) printf(" best match on image: %d, keypoints: %d, ", goodIds[best] + 1, amount);

If someone can point me to the functions or the logic I must use I would greatly appreciate it!

4 comments
What is the problem with saying "the match with the most inliers is my best match"? – Tobias Hermann
That was the first thing I thought of too, but when I tried it I didn't get any accurate results. – tversteeg
OK, so I guess the images (even the false ones) are quite similar. How is the accuracy if you take the average distance of all matches, or of all inliers? Can you post example pictures where this fails? – Tobias Hermann
Have you found an acceptable solution since this post? I'm very interested in a solution; I have a similar issue to solve quickly. – Spawnrider
No, not really. I did get it to work, but it was really slow for multiple images and it didn't scale well, so I chose to use a third-party solution. – tversteeg

4 Answers

1 vote

It depends on what the images in the list are. You can't have one solution for every vision problem in the world. For example, the project I work on needs to recognize the material in pictures of walls. You can't just compare a picture to pictures of walls with different materials and hope to get a match.

In my case, I needed to create descriptors. A descriptor is an algorithm that outputs values that can be compared to the values computed for another picture. There are a lot of descriptors already available in OpenCV, like LBP, SURF, etc. To put it simply, you no longer compare the images themselves; you compare the descriptor output of image 1 to the descriptor values of all the images in the list.

You need to pick the descriptors that your eyes/brain would use to find a match in real life. For example, if the matching is based on color, you could use CLD or DCD. If the matching is based on texture, use LBP. You can also do what I did in my project: use many descriptors and apply machine learning with trained data to find the best match.
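To make the "compare descriptor values, not images" idea concrete, here is a minimal, self-contained sketch that uses a normalized grayscale histogram as a stand-in descriptor and a chi-square distance between two such descriptors; the helper names are hypothetical, and a real system would use one of the OpenCV descriptors mentioned above instead:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy "descriptor": a 16-bin normalized grayscale histogram.
// Lower chi-square distance between two histograms = better match.
std::vector<double> grayHistogram(const std::vector<unsigned char>& pixels) {
    std::vector<double> bins(16, 0.0);
    if (pixels.empty())
        return bins;
    for (unsigned char p : pixels)
        bins[p / 16] += 1.0;                          // 256 levels -> 16 bins
    for (double& b : bins)
        b /= static_cast<double>(pixels.size());      // normalize to sum 1
    return bins;
}

double chiSquareDistance(const std::vector<double>& a,
                         const std::vector<double>& b) {
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        double s = a[i] + b[i];
        if (s > 0.0)
            d += (a[i] - b[i]) * (a[i] - b[i]) / s;   // per-bin chi-square term
    }
    return d;
}
```

The best match from the list is then simply the image whose descriptor has the smallest distance to the source image's descriptor.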

So, to summarize, there is no silver bullet that can fix all vision problems. You need to adapt your solution to the problem.

Hope it helps!

1 vote

There isn't any straightforward answer. For better results, you have to implement some sort of transform and do clustering on the transformed map instead of just summing up the distances. That's difficult and even publishable.

Otherwise, you'll have to use more practical techniques like dimensional and histogram filtering. You can have a look at OpenCV's stitcher, isolate the module you're interested in, and customize the source code to your needs.
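One practical refinement of the asker's scoring (my suggestion, not something the answer spells out) is to rank candidates by their inlier *fraction* rather than the raw inlier count, which compensates for keypoint-rich images simply producing more matches. A self-contained sketch with plain counts, where the function name and the minimum-inlier cutoff of 8 are illustrative assumptions:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rank each candidate image by inliers / total matches instead of the raw
// inlier count. Returns the index of the best candidate, or -1 if none
// passes the minimum-inlier cutoff.
int bestByInlierRatio(const std::vector<int>& inlierCounts,
                      const std::vector<int>& totalMatchCounts,
                      int minInliers = 8) {
    int best = -1;
    double bestRatio = 0.0;
    for (std::size_t i = 0; i < inlierCounts.size(); ++i) {
        if (inlierCounts[i] < minInliers || totalMatchCounts[i] == 0)
            continue;  // too few stable correspondences to trust
        double ratio = static_cast<double>(inlierCounts[i])
                     / totalMatchCounts[i];
        if (ratio > bestRatio) {
            bestRatio = ratio;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

With this scoring, an image with 20 inliers out of 40 matches beats one with 50 inliers out of 500, which is often closer to what the eye judges as the better match.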

0 votes

You should pick only really stable correspondences. I would recommend reading OpenCV 2 Computer Vision Application Programming Cookbook, Chapter 9, "Matching images using random sample consensus" (http://opencv-cookbook.googlecode.com/svn/trunk/Chapter%2009/).
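One of the stability filters that chapter discusses is the Lowe-style ratio test on the two nearest matches per query keypoint (as produced by `knnMatch` with k = 2). A standalone sketch, with a plain struct standing in for `cv::DMatch` and the 0.8 threshold chosen as a typical illustrative value:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for cv::DMatch: only the distance field is needed here.
struct Match { float distance; };

// Ratio test: keep the best match for a keypoint only if it is clearly
// better than the runner-up, i.e. the correspondence is unambiguous.
std::vector<Match> ratioTest(const std::vector<std::vector<Match>>& knnMatches,
                             float ratio = 0.8f) {
    std::vector<Match> kept;
    for (const auto& pair : knnMatches) {
        if (pair.size() < 2)
            continue;  // no runner-up to compare against
        if (pair[0].distance < ratio * pair[1].distance)
            kept.push_back(pair[0]);
    }
    return kept;
}
```

In the asker's loop this would replace the `min_dist * 3` heuristic: run `knnMatch` with k = 2, apply the ratio test, and feed only the surviving matches into `findHomography`.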

0 votes

A short search for your problem turned up the following entry in the OpenCV answers section:

/CV Answer forum

which seems to supply the answer to the question you're having. To filter the results you get, as suggested in that answer, I would take a look at the RANSAC algorithm to find the best results in your match selection.

RANSAC description (Wikipedia)

At least this should point you in the right direction.
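To make the RANSAC pointer concrete, here is a toy, self-contained version of the idea for fitting a line y = a*x + b to 2-D points, analogous to how `findHomography` separates inliers from outliers; the iteration count, threshold, and fixed seed are all illustrative choices, not tuned values:

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

struct LineFit { double a, b; std::size_t inliers; };

// Toy RANSAC: repeatedly fit a line to a minimal random sample (two points)
// and keep the hypothesis supported by the most inliers.
LineFit ransacLine(const std::vector<std::pair<double, double>>& pts,
                   int iterations = 100, double threshold = 0.5) {
    std::mt19937 rng(42);  // fixed seed so the sketch is reproducible
    std::uniform_int_distribution<std::size_t> pick(0, pts.size() - 1);
    LineFit best{0.0, 0.0, 0};
    for (int it = 0; it < iterations; ++it) {
        // 1. Sample a minimal set: two distinct points define a line.
        std::size_t i = pick(rng), j = pick(rng);
        if (i == j || pts[i].first == pts[j].first)
            continue;
        double a = (pts[j].second - pts[i].second)
                 / (pts[j].first - pts[i].first);
        double b = pts[i].second - a * pts[i].first;
        // 2. Count points whose vertical distance to the line is small.
        std::size_t inliers = 0;
        for (const auto& p : pts)
            if (std::fabs(p.second - (a * p.first + b)) < threshold)
                ++inliers;
        // 3. Keep the hypothesis with the most support.
        if (inliers > best.inliers)
            best = {a, b, inliers};
    }
    return best;
}
```

Gross outliers barely affect the recovered line, which is exactly why counting RANSAC inliers (as the asker's `findHomography` call already does) is a much more robust match score than the raw number of descriptor matches.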