
I have pytorch Tensor with shape (batch_size, step, vec_size), for example, a Tensor(32, 64, 128), let's call it A.

I have another Tensor(batch_size, vec_size), e.g. Tensor(32, 128), let's call it B.

I want to insert B into a certain position at axis 1 of A. The insert positions are given in a Tensor(batch_size), named P.

I understand there is no empty tensor (like an empty list) in PyTorch, so I initialize A as zeros and add B at a certain position along axis 1 of A.

A = Variable(torch.zeros(batch_size, step, vec_size))

What I'm doing is like:

for i in range(batch_size):
    pos = P[i]
    A[i][pos] = A[i][pos] + B[i]

But I get an Error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

Then, I make a clone of A inside the loop:

for i in range(batch_size):
    A_clone = A.clone()
    pos = P[i]
    A_clone[i][pos] = A_clone[i][pos] + B[i]

This is very slow for autograd; I wonder if there are any better solutions? Thank you.


1 Answer


You can use a mask instead of cloning.

See the code below:

import torch

# setup
batch, step, vec_size = 64, 10, 128
A = torch.rand((batch, step, vec_size))
B = torch.rand((batch, vec_size))
pos = torch.randint(step, (batch,)).long()

# computations
# create a mask that is 0 at the positions to be replaced
mask = torch.ones((batch, step)).view(batch, step, 1).float()
mask[torch.arange(batch), pos] = 0

# expand B to the same shape as A and combine
result = A*mask + B.unsqueeze(dim=1).expand([-1, step, -1])*(1 - mask)

This way you avoid using for loops and cloning as well.
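Since A starts out as all zeros in the question, another option is to build A out-of-place in a single indexing operation. This is a sketch, assuming P is a LongTensor of positions; the non-inplace `Tensor.index_put` returns a new tensor, so autograd does not complain about inplace modification:

```python
import torch

batch, step, vec_size = 32, 64, 128
B = torch.rand((batch, vec_size), requires_grad=True)
P = torch.randint(step, (batch,)).long()

# out-of-place index_put returns a new tensor with B placed
# at row P[i] of sample i, so there is no inplace error
A = torch.zeros(batch, step, vec_size)
A = A.index_put((torch.arange(batch), P), B)

A.sum().backward()  # gradients flow back to B
```

This avoids both the Python loop and the mask arithmetic, at the cost of relying on A being zero-initialized at the insert positions.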