1
votes

I'm trying to find all pictures in a folder that are "almost" identical. By "almost identical", I mean that you have an original picture and modified versions of it. The modifications can be a change of resolution, conversion to greyscale, cropping, rotation, an added frame or text, mirroring, and so on.

I'm using OpenCV with SIFT and ORB (I choose which method to use for each run; I don't use both at the same time). For all the picture variations, both SIFT and ORB work quite well, except for the mirrored picture. Even if I only mirror the first picture (without changing anything else), the similarity score is only about 10%.

I don't understand this, as I thought that SIFT and ORB calculated distances between keypoints. When an image is mirrored, the distances don't change, only the direction.

What am I missing?

Here is an extract from my code:

if method   == 'ORB':
    finder = cv2.ORB_create()
elif method == 'SIFT':
    finder = cv2.xfeatures2d.SIFT_create()

lowe_ratio = 0.86

# find the keypoints and descriptors with SIFT or ORB
for i in range(0,count-1):
    for j in range(i+1,count):
        kp1, des1 = finder.detectAndCompute(all_new_images_to_compare[i],None)
        kp2, des2 = finder.detectAndCompute(all_new_images_to_compare[j],None)
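        # ORB descriptors are binary, so compare them with Hamming distance; SIFT uses L2
        bf = cv2.BFMatcher(cv2.NORM_HAMMING if method == 'ORB' else cv2.NORM_L2)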
        matches = bf.knnMatch(des1,des2, k=2)        
        good_points = []
        for m,n in matches:
            if m.distance < lowe_ratio * n.distance:
                good_points.append(m)
        # score against the larger of the two keypoint counts
        number_keypoints = max(len(kp1), len(kp2))
        percentage_similarity = len(good_points) / number_keypoints * 100
       
        if  (percentage_similarity)>=10:
            myfile1=open("C:/Users/ABC/Documents/Find-Similar-Pictures/results_" + method + ".txt","a")
            myfile1.write(str(titles[i]) + "\t" + str(titles[j]) + "\t" + method + " (" + str(lowe_ratio) + ") \t" + str(int(percentage_similarity) ) + "\t\n")
            myfile1.close()
            print(titles[i], titles[j],"== Similarity: " + str(int(percentage_similarity)), method + " (" + str(lowe_ratio) + ")")
    print("___done with file", titles[i])
print("=====done=====")

Thanks a lot for your help


1 Answer

2
votes

You've discovered a property of the ORB descriptor and the SIFT detector: neither is invariant to reflection.
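
To see why the descriptors stop matching even though distances are preserved, here is a toy sketch in plain Python. It is not OpenCV's implementation: it builds a single magnitude-weighted histogram of gradient directions, the kind of building block SIFT-style descriptors are assembled from. The patch values and the helper names (`orientation_histogram`, `mirror`, `euclidean`) are made up for illustration. Mirroring turns every gradient angle theta into 180 - theta, so the same values end up in different bins and a plain vector comparison sees two very different descriptors.

```python
import math

def orientation_histogram(patch, bins=8):
    """Toy single-cell histogram of gradient directions, weighted by magnitude."""
    hist = [0.0] * bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]  # central differences
            gy = patch[y + 1][x] - patch[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # flat pixel, no direction to vote for
            angle = math.degrees(math.atan2(gy, gx))
            k = round(angle / (360 / bins)) % bins  # nearest bin centre
            hist[k] += math.hypot(gx, gy)
    return hist

def mirror(patch):
    """Flip a patch left to right."""
    return [row[::-1] for row in patch]

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical 5x5 patch with an asymmetric intensity ramp.
patch = [[3 * x + y * y for x in range(5)] for y in range(5)]

hist = orientation_histogram(patch)
hist_mirrored = orientation_histogram(mirror(patch))

# theta -> 180 - theta moves bin k to bin (4 - k) % 8 when there are 8 bins,
# so undoing that permutation recovers the original histogram.
hist_unpermuted = [hist_mirrored[(4 - k) % 8] for k in range(8)]

print("original :", [round(v, 1) for v in hist])
print("mirrored :", [round(v, 1) for v in hist_mirrored])
print("distance as-is               :", euclidean(hist, hist_mirrored))
print("distance with bins rearranged:", euclidean(hist, hist_unpermuted))
```

The information is all still there after mirroring, but a matcher that compares descriptors element by element has no way of knowing the bins were rearranged.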

If you're interested in matching reflected images, you'll need to do one of the following:

  1. Use a symmetric keypoint detector, e.g. FAST, Harris or CenSurE, and then compute the SIFT descriptor on the reflected keypoint.
  2. Implement a reflection-invariant descriptor such as MBR-SIFT.
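
If neither option is practical, a cruder workaround (not one of the two above, and it doubles the matching cost) is to score each pair twice, once as-is and once against a mirrored copy of the second image, and keep the better score. The sketch below only shows that control flow. `mirror_aware_score`, `flip_rows` and `pixel_agreement` are hypothetical names, and the last two are toy stand-ins so the example runs without OpenCV.

```python
def mirror_aware_score(image_a, image_b, score, flip):
    """Best of the direct score and the score against a mirrored copy of image_b."""
    return max(score(image_a, image_b), score(image_a, flip(image_b)))

# Toy stand-ins: images are lists of rows, the score is the percentage of equal pixels.
def flip_rows(image):
    return [row[::-1] for row in image]

def pixel_agreement(image_a, image_b):
    pairs = [(p, q)
             for row_a, row_b in zip(image_a, image_b)
             for p, q in zip(row_a, row_b)]
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)

original = [[1, 2, 3], [4, 5, 6]]
mirrored = flip_rows(original)

print("direct score      :", pixel_agreement(original, mirrored))
print("mirror-aware score:", mirror_aware_score(original, mirrored, pixel_agreement, flip_rows))
```

With OpenCV, `flip` could be something like `lambda img: cv2.flip(img, 1)` (flip code 1 mirrors left to right), and `score` would be the ratio-test percentage from the question wrapped in a function.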

There's a good analysis of the problem given in this paper: Symmetric Stability of Low Level Feature Detectors

Best of luck!