2
votes

I have a 4D NumPy array of shape (15, 2, 320, 320). In other words, each element of the [320 x 320] grid holds a matrix of size [15 x 2]. Now, I would like to compute the dot product for each element of the [320 x 320] grid, then extract its diagonal.

Currently, I am using two "for" loops, and the code works correctly (see the snippet). However, the calculation is too slow when I process large data. Can anyone show me how to vectorize the computation without the loops?

import numpy as np

A = np.random.rand(15, 2, 320, 320)
B = np.zeros((2, 320, 320))  # store the final results

for row in range(320):
    for col in range(320):
        C = np.dot(A[:, :, row, col].T, A[:, :, row, col])  # the shape of C is [2 x 2]
        C = np.diag(C)
        B[:, row, col] = C

Any help or suggestions would be greatly appreciated!

3
Your first two sentences contradict each other. Please re-confirm whether the shape is (15, 2, 320, 320) or (320, 320, 15, 2). – fountainhead
Also, what is the expected output shape? Your output produces a shape of (2, 320, 320). – Akshay Sehgal
Thanks @fountainhead and @Akshay Sehgal. The shape of the input array A is (15, 2, 320, 320). The expected shape of the output array B is (2, 320, 320). – lehoanglittle

3 Answers

3
votes

It's kinda funny sometimes how tensor algebra works out. The diagonal of each [2 x 2] product is just a column-wise sum of squares, so the whole thing is (A*A).sum(0):

A = np.random.rand(15, 2, 320, 320)
B1 = np.zeros((2, 320, 320))  # store the final results
 
for row in range(320):
    for col in range(320):
        C = np.dot(A[:, :, row, col].T, A[:, :, row, col])  # the shape of C is [2 x 2]
        C = np.diag(C)
        B1[:, row, col] = C

B2 = (A*A).sum(0)

np.allclose(B1, B2)
True

(Credit to @Ehsan and @hpaulj who came up with the einsum answer that suggests this formulation.)
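To see why this works, here is a small sanity check on a single (15, 2) slice (the shapes here are illustrative): the diagonal of M.T @ M is exactly the column-wise sum of squares, which is what summing A*A over axis 0 computes for every grid cell at once.

```python
import numpy as np

# One (15, 2) slice, like A[:, :, row, col] in the question.
rng = np.random.default_rng(0)
M = rng.random((15, 2))

diag = np.diag(M.T @ M)       # entry j is sum_i M[i, j]**2
sumsq = (M * M).sum(axis=0)   # the same quantity, no matrix product needed

assert np.allclose(diag, sumsq)
```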

2
votes

EDIT: as suggested by @hpaulj in the comments, you can use a shorter version of the code below. It multiplies A by itself elementwise and then sums over i (i.e., the rows), which is equivalent to taking the diagonal of the dot product:

np.einsum('ijkl,ijkl->jkl',A,A)

You can vectorize it with a one-liner as below. Of course, you could substitute np.einsum with a combination of transpose and dot product, but I think the indices in np.einsum make the code cleaner and more readable. np.diagonal takes the diagonal over the first two axes and appends it as the last dimension; hence, you need the transpose at the end to move that dimension back to the front where you want it:

np.diagonal(np.einsum('ijkl,imkl->jmkl',A,A)).transpose(2,0,1)

Test that the output matches your code's output B:

np.allclose(np.diagonal(np.einsum('ijkl,imkl->jmkl',A,A)).transpose(2,0,1), B)
#True
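The shorter einsum form can be checked against the original double loop directly; this sketch uses a smaller (hypothetical) 8 x 8 grid so the loop stays fast:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((15, 2, 8, 8))

# Original loop formulation on the small grid.
B_loop = np.zeros((2, 8, 8))
for row in range(8):
    for col in range(8):
        C = A[:, :, row, col].T @ A[:, :, row, col]
        B_loop[:, row, col] = np.diag(C)

# One-line einsum: elementwise product, then sum over the i axis.
B_einsum = np.einsum('ijkl,ijkl->jkl', A, A)

assert np.allclose(B_loop, B_einsum)
```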
1
vote

This should work:

A_T_1 = np.transpose(A, (2,3,0,1))           # Shape (320,320,15,2)
A_T_2 = np.transpose(A, (2,3,1,0))           # Shape (320,320,2,15)
C_all = A_T_2 @ A_T_1                        # Shape (320,320,2,2)
D_all = np.diagonal(C_all, axis1=2, axis2=3) # Shape (320,320,2)

B = np.transpose(D_all, (2,0,1))             # Shape (2,320,320)

Note:

  • transpose() only returns a view of its input array and doesn't make an expensive data copy.
  • With this approach you don't need to initialize B = np.zeros((2, 320, 320))
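Here is the same matmul-based route as a runnable sketch, checked against the elementwise formulation; the grid is shrunk to a hypothetical 8 x 8 to keep it quick (np.matmul broadcasts the matrix product over the two leading axes):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((15, 2, 8, 8))

A_T_1 = np.transpose(A, (2, 3, 0, 1))            # (8, 8, 15, 2), a view
A_T_2 = np.transpose(A, (2, 3, 1, 0))            # (8, 8, 2, 15), a view
C_all = A_T_2 @ A_T_1                            # (8, 8, 2, 2), batched matmul
D_all = np.diagonal(C_all, axis1=2, axis2=3)     # (8, 8, 2)
B = np.transpose(D_all, (2, 0, 1))               # (2, 8, 8)

# Same result as the elementwise square-and-sum shortcut.
assert np.allclose(B, (A * A).sum(0))
```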