
I am trying to implement face recognition with Principal Component Analysis (PCA) in Python. One of the steps is to normalize the input (test) image T by subtracting the average face vector m: n = T - m.

Here is my code:

import glob
import numpy as np
from PIL import Image

#Step 1: put database images into a 2D array
filenames = glob.glob('C:\\Users\\Karim\\Downloads\\att_faces\\New folder/*.pgm')
filenames.sort()
img = [Image.open(fn).convert('L').resize((90, 90)) for fn in filenames]
images = np.asarray([np.array(im).flatten() for im in img])

#Step 2: find the mean image and the mean-shifted input images
m = images.mean(axis=0)
shifted_images = images - m

#Step 7: input image
input_image = Image.open('C:\\Users\\Karim\\Downloads\\att_faces\\1.pgm').convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - m

But I am getting an error:

Traceback (most recent call last):
  File "C:/Users/Karim/Desktop/Bachelor 2/New folder/new3.py", line 46, in <module>
    n = T - m
ValueError: operands could not be broadcast together with shapes (90,90) (8100)


2 Answers

3
votes

The mean image m was computed from flattened arrays:

images = np.asarray([np.array(im).flatten() for im in img])
m = images.mean(axis=0)

so m has shape (8100,), while input_image is 90x90. Hence the error. You should either flatten the input image too, or not flatten the original images (I don't quite understand why you do it), or reshape m to 90x90 just for this operation.
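As a rough sketch of two of these options (flattening the input, or reshaping the mean), here is a version that uses random stand-in data instead of the real face images. The array sizes and the names n_flat and n_2d are illustrative assumptions, not part of the original script:

```python
import numpy as np

# Stand-in data: 5 fake grayscale "images" of 90x90 pixels,
# each flattened to 8100 values (assumption: real images load the same way).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 90 * 90)).astype(np.float64)
m = images.mean(axis=0)            # shape (8100,)

# Stand-in for the 90x90 input image.
T = rng.integers(0, 256, size=(90, 90)).astype(np.float64)

# Option A: flatten the input so it matches the flattened mean.
n_flat = T.flatten() - m           # shape (8100,)

# Option B: reshape the mean back to 90x90 just for the subtraction.
n_2d = T - m.reshape(90, 90)       # shape (90, 90)

# Both hold the same values, only arranged differently.
assert np.allclose(n_flat.reshape(90, 90), n_2d)
```

Which one is more convenient probably depends on whether the later PCA steps expect vectors or 2D images.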

3
votes

As @Lev says, you have flattened your arrays. You don't actually need to do this to compute the mean. Say you have an array of two 3x4 images, then you'd have something like this:

In [291]: b = np.random.rand(2,3,4)

In [292]: b.shape
Out[292]: (2, 3, 4)

In [293]: b
Out[293]: 
array([[[ 0.18827554,  0.11340471,  0.45185287,  0.47889188],
        [ 0.35961448,  0.38316556,  0.73464482,  0.37597429],
        [ 0.81647845,  0.28128797,  0.33138755,  0.55403119]],

       [[ 0.92025024,  0.55916671,  0.23892798,  0.59253267],
        [ 0.15664109,  0.12457157,  0.28139198,  0.31634361],
        [ 0.33420446,  0.27599807,  0.40336601,  0.67738928]]])

Take the mean over the first axis, which keeps the shape of the individual images, and subtract it from the whole array:

In [300]: b.mean(0)
Out[300]: 
array([[ 0.55426289,  0.33628571,  0.34539042,  0.53571227],
       [ 0.25812778,  0.25386857,  0.5080184 ,  0.34615895],
       [ 0.57534146,  0.27864302,  0.36737678,  0.61571023]])

In [301]: b - b.mean(0)
Out[301]: 
array([[[-0.36598735, -0.222881  ,  0.10646245, -0.0568204 ],
        [ 0.10148669,  0.129297  ,  0.22662642,  0.02981534],
        [ 0.24113699,  0.00264495, -0.03598923, -0.06167904]],

       [[ 0.36598735,  0.222881  , -0.10646245,  0.0568204 ],
        [-0.10148669, -0.129297  , -0.22662642, -0.02981534],
        [-0.24113699, -0.00264495,  0.03598923,  0.06167904]]])

For many uses, this will also be faster than keeping your images as a list of arrays, since the numpy operations are done on one array instead of looping over a list of arrays. Many reductions, such as mean, sum and std, accept an axis argument, and you can pass a tuple of axes to reduce over several dimensions at once without having to flatten. Note that np.cov is not one of them: it has no axis argument and expects 1D or 2D input.
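To illustrate the axis argument, here is a small sketch on the same kind of (2, 3, 4) array. The names mean_image, per_image and flat are made up for the example, and the np.cov call assumes you want one variable per image:

```python
import numpy as np

b = np.random.rand(2, 3, 4)        # two 3x4 "images"

# Mean over the image axis only: one 3x4 mean image.
mean_image = b.mean(axis=0)        # shape (3, 4)

# Mean over both pixel axes at once: one value per image.
per_image = b.mean(axis=(1, 2))    # shape (2,)

# Same result as flattening each image first.
assert np.allclose(per_image, b.reshape(2, -1).mean(axis=1))

# np.cov works on 2D input, so reshape to one row per image before calling it.
flat = b.reshape(len(b), -1)       # shape (2, 12)
c = np.cov(flat)                   # shape (2, 2); rows are treated as variables by default
```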

To apply this to your script, I would do something like this, keeping the original dimensionalities:

# convert each PIL image to an array explicitly before stacking them
images = np.asarray([np.array(Image.open(fn).convert('L').resize((90, 90))) for fn in filenames])
# so images.shape = (len(filenames), 90, 90)

m = images.mean(0)
# numpy broadcasting will automatically subtract the (90, 90) mean image from each of the `images`
# m.shape = (90, 90)
# shifted_images.shape = images.shape = (len(filenames), 90, 90)
shifted_images = images - m

#Step 7: input image
input_image = Image.open(...).convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - m
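If you want to check the shapes without touching the image files, something like the following should work. It is only a sketch: the random uint8 arrays stand in for the loaded images, and the names A and n_vec are hypothetical, added to show how you could still get row vectors if a later PCA step needs them:

```python
import numpy as np

# Stand-in for the loaded database: 10 fake 90x90 grayscale images
# (uint8, like PIL's 'L' mode).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 90, 90), dtype=np.uint8)

m = images.mean(0)                 # float64 mean image, shape (90, 90)
shifted_images = images - m        # broadcasts to shape (10, 90, 90)

# Stand-in for the input image.
T = rng.integers(0, 256, size=(90, 90), dtype=np.uint8)
n = T - m                          # shape (90, 90), no broadcasting error

# If a later step needs one row vector per image, reshape at that point.
A = shifted_images.reshape(len(shifted_images), -1)   # shape (10, 8100)
n_vec = n.reshape(-1)                                 # shape (8100,)
```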

As a final comment, if speed is an issue, it can be faster to use np.dstack to join your images (these timings are from Python 2, hence xrange):

In [354]: timeit b = np.asarray([np.empty((50,100)) for i in xrange(1000)])
1 loops, best of 3: 824 ms per loop

In [355]: timeit b = np.dstack([np.empty((50,100)) for i in xrange(1000)]).transpose(2,0,1)
10 loops, best of 3: 118 ms per loop

But it's likely that loading the images takes most of the time, and if that's the case you can ignore this.
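For reference, here is a small sketch showing that the two constructions produce the same array, written for Python 3 (range instead of xrange). The frames list is made-up test data, and np.stack is included as a further alternative that is not part of the timings above:

```python
import numpy as np

# 20 small constant-valued arrays so each one is easy to tell apart.
frames = [np.full((50, 100), i, dtype=float) for i in range(20)]

a = np.asarray(frames)                        # shape (20, 50, 100)
b = np.dstack(frames).transpose(2, 0, 1)      # stacked on the last axis, then moved to the front
c = np.stack(frames, axis=0)                  # stacks along a new first axis directly

assert a.shape == b.shape == c.shape == (20, 50, 100)
assert np.array_equal(a, b) and np.array_equal(a, c)
```

Timings vary with the numpy version and the array sizes, so it is worth measuring with timeit on your own data before picking one.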