2
votes

I'm trying to speed up my code to perform some numerical calculations where I need to multiply 3 matrices with an array. The structure of the problem is the following:

  • The array as a shape of (N, 10)
  • The first matrix is constant along the dynamic dimension of the array and has a shape of (10, 10)
  • The other two matrices vary along the first dimension of the array and have a (N, 10, 10) shape
  • The result of the calculation should be an array with (N, shape)

I've implemented a solution using for loops that is working, but I'd like to have a better performance so I'm trying to use the numpy functions. I've tried using numpy.tensordot but when multiplying the dynamic matrices with the array I get a shape of (N, 10, N) instead of (N, 10)

My for loop is the following:

res = np.zeros(temp_rho.shape, dtype=np.complex128)
for i in range(temp_rho.shape[0]):
    res[i] = np.dot(self.constMatrix, temp_rho[i])  
    res[i] += np.dot(self.dinMat1[i], temp_rho[i])
    res[i] += np.dot(self.dinMat2[i], np.conj(temp_rho[i]))
#temp_rho.shape = (N, 10)
#res.shape = (N, 10)
#self.constMatrix.shape = (10, 10)
#self.dinMat1.shape = (N, 10, 10)
#self.dinMat2.shape = (N, 10, 10)

How should this code be implemented dot products of numpy, returning the correct dimensions?

1

1 Answers

2
votes

Here's an approach using a combination of np.dot and np.einsum -

parte1 = constMatrix.dot(temp_rho.T).T
parte2 = np.einsum('ijk,ik->ij',dinMat1, temp_rho)
parte3 = np.einsum('ijk,ik->ij',dinMat2, np.conj(temp_rho))
out = parte1 + parte2 + parte3

Alternative way to get parte1 would be with np.tensordot -

parte1 = np.tensordot(temp_rho, constMatrix, axes=([1],[1]))

Why doesn't numpy.tensordot work for the later two sum-reductions?

Well, we need to keep the first axis between dinMat1/dinMat2 aligned against the first axis of temp_rho/np.conj(temp_rho), which isn't possible with tensordot as the axes that are not sum-reduced are elementwise multiplied along two separate axes. Therefore, when used with np.tensordot, we would end up with two axes of length N corresponding to the first axis each from the two inputs.