18
votes

I am analyzing multiple images and need to be able to tell if they are shifted compared to a reference image. The purpose is to tell if the camera moved at all in between capturing images. I would ideally like to be able to correct the shift in order to still do the analysis, but at a minimum I need to be able to determine if an image is shifted and discard it if it's beyond a certain threshold.

Here are some examples of the shifts in an image I would like to detect:

reference imageshifted image 1shifted image 2

I will use the first image as a reference and then compare all of the following images to it to figure out if they are shifted. The images are gray-scale (they are just displayed in color using a heat-map) and are stored in a 2-D numpy array. Any ideas how I can do this? I would prefer to use the packages I already have installed (scipy, numpy, PIL, matplotlib).

3
I believe you're looking for cross-correlation.Lukas Graf

3 Answers

22
votes

As Lukas Graf hints, you are looking for cross-correlation. It works well, if:

  1. The scale of your images does not change considerably.
  2. There is no rotation change in the images.
  3. There is no significant illumination change in the images.

For plain translations cross-correlation is very good.

The simplest cross-correlation tool is scipy.signal.correlate. However, it uses the trivial method for cross-correlation, which is O(n^4) for a two-dimensional image with side length n. In practice, with your images it'll take very long.

The better too is scipy.signal.fftconvolve as convolution and correlation are closely related.

Something like this:

import numpy as np
import scipy.signal

def cross_image(im1, im2):
   # get rid of the color channels by performing a grayscale transform
   # the type cast into 'float' is to avoid overflows
   im1_gray = np.sum(im1.astype('float'), axis=2)
   im2_gray = np.sum(im2.astype('float'), axis=2)

   # get rid of the averages, otherwise the results are not good
   im1_gray -= np.mean(im1_gray)
   im2_gray -= np.mean(im2_gray)

   # calculate the correlation image; note the flipping of onw of the images
   return scipy.signal.fftconvolve(im1_gray, im2_gray[::-1,::-1], mode='same')

The funny-looking indexing of im2_gray[::-1,::-1] rotates it by 180° (mirrors both horizontally and vertically). This is the difference between convolution and correlation, correlation is a convolution with the second signal mirrored.

Now if we just correlate the first (topmost) image with itself, we get:

enter image description here

This gives a measure of self-similarity of the image. The brightest spot is at (201, 200), which is in the center for the (402, 400) image.

The brightest spot coordinates can be found:

np.unravel_index(np.argmax(corr_img), corr_img.shape)

The linear position of the brightest pixel is returned by argmax, but it has to be converted back into the 2D coordinates with unravel_index.

Next, we try the same by correlating the first image with the second image:

enter image description here

The correlation image looks similar, but the best correlation has moved to (149,200), i.e. 52 pixels upwards in the image. This is the offset between the two images.


This seems to work with these simple images. However, there may be false correlation peaks, as well, and any of the problems outlined in the beginning of this answer may ruin the results.

In any case you should consider using a windowing function. The choice of the function is not that important, as long as something is used. Also, if you have problems with small rotation or scale changes, try correlating several small areas agains the surrounding image. That will give you different displacements at different positions of the image.

1
votes

Another way to solve it is to compute sift points in both images, use RANSAC to get rid of outliers and then solve for translation using a least squares estimator.

0
votes

as Bharat said as well another is using sift features and Ransac:

import numpy as np
import cv2
from matplotlib import pyplot as plt

def crop_region(path, c_p):
    """
      This function crop the match region in the input image
      c_p: corner points
    """    
    # 3 or 4 channel as the original
    img = cv2.imread(path, -1)

    # mask 
    mask = np.zeros(img.shape, dtype=np.uint8) 

    # fill the the match region 
    channel_count = img.shape[2]  
    ignore_mask_color = (255,)*channel_count
    cv2.fillPoly(mask, c_p, ignore_mask_color)

    # apply the mask
    matched_region = cv2.bitwise_and(img, mask)

    return matched_region

def features_matching(path_temp,path_train):
    """
          Function for Feature Matching + Perspective Transformation
    """       
    img1 = cv2.imread(path_temp, 0)   # template
    img2 = cv2.imread(path_train, 0)   # input image

    min_match=10

    # SIFT detector
    sift = cv2.xfeatures2d.SIFT_create()

    # extract the keypoints and descriptors with SIFT

    kps1, des1 = sift.detectAndCompute(img1,None)
    kps2, des2 = sift.detectAndCompute(img2,None)

    FLANN_INDEX_KDTREE = 0
    index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
    search_params = dict(checks = 50)

    flann = cv2.FlannBasedMatcher(index_params, search_params)

    matches = flann.knnMatch(des1, des2, k=2)

    # store all the good matches (g_matches) as per Lowe's ratio 
    g_match = []
    for m,n in matches:
        if m.distance < 0.7 * n.distance:
            g_match.append(m)
    if len(g_match)>min_match:
        src_pts = np.float32([ kps1[m.queryIdx].pt for m in g_match ]).reshape(-1,1,2)
        dst_pts = np.float32([ kps2[m.trainIdx].pt for m in g_match ]).reshape(-1,1,2)

        M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC,5.0)
        matchesMask = mask.ravel().tolist()

        h,w = img1.shape
        pts = np.float32([ [0,0],[0,h-1],[w-1,h-1],[w-1,0] ]).reshape(-1,1,2)
        dst = cv2.perspectiveTransform(pts,M)

        img2 = cv2.polylines(img2, [np.int32(dst)], True, (0,255,255) , 3, cv2.LINE_AA)

    else:
        print "Not enough matches have been found! - %d/%d" % (len(g_match), min_match)
        matchesMask = None

    draw_params = dict(matchColor = (0,255,255), 
                       singlePointColor = (0,255,0),
                       matchesMask = matchesMask, # only inliers
                       flags = 2)
    # region corners    
    cpoints=np.int32(dst)
    a, b,c = cpoints.shape

    # reshape to standard format
    c_p=cpoints.reshape((b,a,c))  

    # crop matching region
    matching_region = crop_region(path_train, c_p)

    img3 = cv2.drawMatches(img1, kps1, img2, kps2, g_match, None, **draw_params)
    return (img3,matching_region)