0
votes

I have a 4x4x256 tensor and a 128x256 matrix. I need to multiply each 256-d depth-wise vector of the tensor by the matrix, such that I get a 4x4x128 tensor as a result.

Working in Numpy it's not clear to me how to do this. In their current shape it doesn't look like any variant of np.dot exists to do this. Manipulating the shapes to take advantage of broadcasting rules doesn't seem to provide any help. np.tensordot and np.einsum may be useful but looking at the documentation is going right over my head.

Is there an efficient way to do this?

1

1 Answers

1
votes

You can use np.einsum to do this operation. An example with random values:

a = np.arange(4096.).reshape(4,4,256)
b = np.arange(32768.).reshape(128,256)
c = np.einsum('ijk,lk->ijl',a,b)
print(c.shape)

Here, the subscripts argument is: ijk,lk->ijl
From your requirement, i=4, j=4, k=256, l=128
The comma separates the subscripts for two operands, and the subscripts state that the multiplication should be performed over the last subscript in each tensor (the subscript k which is common to both the tensors).

The tensor subscript after the -> states that the resultant tensor should have the shape (i,j,l). Now depending on the type of operation you are performing, you might have to retain this subscript or change this subscript to jil, but the rest of the subscripts remains the same.