I think I found a way to do it with einsum, and it is much faster.
Let's generalize the inputs A and B as below:
import numpy as np

na = 2
nb = 3
nx = 50
ny = 60
A = np.random.random((na, nx, ny))
B = np.random.random((nb, nx, ny))
In Einstein notation, the operation can be written as this einsum function:
def f_einstein(A, B):
    # Insert singleton axes so the batch axes of A and B occupy different dimensions
    a = A[:, np.newaxis, :, :]    # shape (na, 1, nx, ny)
    b = B[np.newaxis, :, :, :]    # shape (1, nb, nx, ny)
    einstein = np.einsum('ikxy,kjxy->ijxy', a, b)    # shape (na, nb, nx, ny)
    # Merge the two batch axes into a single one
    return einstein.reshape(-1, *einstein.shape[2:])
Thanks to @Bobby Ocean, the iterative method is:
import itertools as it

def f_iter(A, B):
    # Element-wise multiply every 2D slice of A with every 2D slice of B, then stack
    return np.stack([a * b for a, b in it.product(A, B)])
To verify, the following returns True:
result_einstein = f_einstein(A, B)
result_iter = f_iter(A, B)
(result_einstein == result_iter).all()
The einsum version is roughly 1.5 to 6 times faster than the iterative one, depending on the batch sizes.
When na = 2, nb = 3, it is about 1.5 times faster:
%timeit result_einstein = f_einstein(A,B)
The slowest run took 6.33 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 34.3 µs per loop
%timeit result_iter = f_iter(A,B)
The slowest run took 11.74 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 51.3 µs per loop
When na = 200, nb = 300, it is about 6 times faster:
%timeit result_einstein = f_einstein(A,B)
1 loop, best of 3: 1.2 s per loop
%timeit result_iter = f_iter(A,B)
1 loop, best of 3: 7.41 s per loop
My comments to help people understand it:
The operation I am doing is something like a batched outer Hadamard product: it takes the outer product over the first dimensions of A and B, and for each resulting pair it performs a Hadamard product, i.e. an element-wise product of the [50, 60] matrices.
Because it involves an outer-product-like operation, the batch axes of the original A and B have to sit in different dimensions. That is why new axes are introduced so the outer product can work:
a = A[:,np.newaxis, :,:]
b = B[np.newaxis,:, :,:]
The new shapes are now [2, 1, 50, 60] and [1, 3, 50, 60]. On the first two dimensions, a and b can form an outer product easily with einsum; in (partial) Einstein notation it is just 'ik,kj->ij' (and since the shared index k has size 1, this "matrix product" is really just an outer product).
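As a quick sanity check of that partial notation, here is a minimal sketch with dummy arrays (not part of the original computation):
u = np.random.random((2, 1))
v = np.random.random((1, 3))
print(np.einsum('ik,kj->ij', u, v).shape)   # (2, 3): with k of size 1, this is the outer product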
For the final Hadamard product, the (partial) Einstein notation is simply 'xy,xy->xy'.
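Again as a small sketch with dummy matrices, 'xy,xy->xy' is just the element-wise product:
x = np.random.random((50, 60))
y = np.random.random((50, 60))
print(np.allclose(np.einsum('xy,xy->xy', x, y), x * y))   # True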
Combining the two, the full Einstein notation is 'ikxy,kjxy->ijxy'. The only remaining step is to reshape the result from [i, j, x, y] to [i*j, x, y].
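Putting it together, here is a minimal sketch (reusing A, B, na, nb, nx, ny from above) that checks the intermediate shape and shows that the same batched outer Hadamard product can also be written with plain broadcasting:
a = A[:, np.newaxis, :, :]                             # (2, 1, 50, 60)
b = B[np.newaxis, :, :, :]                             # (1, 3, 50, 60)
outer_hadamard = np.einsum('ikxy,kjxy->ijxy', a, b)    # (2, 3, 50, 60)
print(outer_hadamard.shape)
# The size-1 axes broadcast automatically, so a plain product gives the same values
broadcast = (a * b).reshape(na * nb, nx, ny)
print(np.allclose(broadcast, outer_hadamard.reshape(na * nb, nx, ny)))   # True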
Credit to Tim Rocktäschel: I read a very good page by him, https://rockt.github.io/2018/04/30/einsum. The way he colors the matrices made it very easy for me to understand einsum. And yes, I want to master einsum for my deep learning projects.