Efficient pytorch broadcasting command not obtained

Question

I have class-wise feature vectors for the 5 classes in my model. The feature vector for each class is 20-dimensional. I want to multiply each class's feature vector by a scalar gain and sum the resulting weighted feature vectors to form a new feature matrix.

The gain_matrix holds one scalar gain for each (i, j) pair of classes. The new 20-dimensional feature vector of the i-th class is the sum, over all classes j (including j = i), of gain_matrix[i][j] times the feature vector of the j-th class. The exact implementation is shown below.

import torch

nClass = 5
feature_dim = 20

gain_matrix = torch.rand(nClass, nClass)
feature_matrix = torch.rand(nClass, feature_dim)  # in my implementation this is output from model
feature_matrix_new = torch.zeros(nClass, feature_dim)

# row i of the result is the gain-weighted sum of all class feature vectors
for i in range(nClass):
    for j in range(nClass):
        feature_matrix_new[i, :] += gain_matrix[i][j] * feature_matrix[j, :]

The nested for loop is slowing down the implementation a lot.

Is there any efficient PyTorch broadcasting solution to avoid the nested for loop in my implementation?

I have read the PyTorch broadcasting documentation page, but it did not help me much.

Ivan Ivan · Accepted Answer · 2021-08-16T07:59:41

This would be a good place to use torch.einsum:

>>> feature_matrix_new = torch.einsum('ij,jk->ik', gain_matrix, feature_matrix)

However, in this case it simply comes down to a matrix multiplication:

>>> feature_matrix_new = gain_matrix @ feature_matrix
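
As a quick sanity check, here is a minimal sketch comparing the nested loop from the question with the two one-liners above. The fixed seed, the result variable names, the tolerance, and the extra explicit-broadcasting variant are assumptions added for illustration; they are not part of the original question or answer.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the comparison is reproducible

nClass = 5
feature_dim = 20

gain_matrix = torch.rand(nClass, nClass)
feature_matrix = torch.rand(nClass, feature_dim)

# Reference result: the nested loop from the question.
loop_result = torch.zeros(nClass, feature_dim)
for i in range(nClass):
    for j in range(nClass):
        loop_result[i, :] += gain_matrix[i, j] * feature_matrix[j, :]

# The two vectorized forms from the answer.
einsum_result = torch.einsum('ij,jk->ik', gain_matrix, feature_matrix)
matmul_result = gain_matrix @ feature_matrix

# Explicit broadcasting (illustrative only): shapes (nClass, nClass, 1) and
# (1, nClass, feature_dim) broadcast to (nClass, nClass, feature_dim);
# summing over dim=1 contracts the j index.
broadcast_result = (gain_matrix[:, :, None] * feature_matrix[None, :, :]).sum(dim=1)

# All four should agree up to floating-point rounding.
for name, result in [('einsum', einsum_result),
                     ('matmul', matmul_result),
                     ('broadcast', broadcast_result)]:
    print(name, torch.allclose(loop_result, result, atol=1e-5))
```

The broadcasting variant materializes an intermediate tensor of shape (nClass, nClass, feature_dim), so the plain matrix multiplication is likely the better choice in practice.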
