0
votes

A PyTorch question regarding backward(). In the PyTorch blitz tutorial, copied and pasted below, they pass the vector [0.1, 1.0, 0.0001] to backward(). I can intuitively guess why the shape of the vector passed in is [3], but I do not understand where the values 0.1, 1.0, 0.0001 come from. Another tutorial I looked at calls backward on a vector like this: L.backward(torch.ones(L.shape))

# copied from blitz tutorial
Now in this case y is no longer a scalar. torch.autograd could not compute the full Jacobian directly, but if we just want the vector-Jacobian product, simply pass the vector to backward as argument:

v = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(v)

print(x.grad)

If anyone can explain the reasoning for [0.1, 1.0, 0.0001], I would appreciate it.

1
To the best of my knowledge, they are arbitrary numbers. In a tutorial, it might be better to give distinct, non-random numbers so that readers can notice that the output is proportional to the vector. - Berriel

1 Answer

1
votes

As the documentation says, grad can be implicitly created only for scalar outputs. y is a non-scalar tensor, so you cannot call y.backward() directly. But you can pass a vector to backward to get the vector-Jacobian product. If you don't want to scale the gradients, pass a vector of all ones.

import torch

x = torch.tensor([2., 3., 4.], requires_grad=True)
y = x ** 2

# y.backward()  # RuntimeError: grad can be implicitly created only for scalar outputs

y.backward(torch.tensor([1., 1., 1.]))  # works
x.grad  # tensor([4., 6., 8.])

# Passing a different vector scales the gradients accordingly
# (note that repeated backward() calls accumulate into x.grad):
# y.backward(torch.tensor([2., 2., 2.]))
# x.grad  # tensor([8., 12., 16.])
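To make this concrete, here is a small sketch reusing the example above with the blitz tutorial's vector: x.grad is just the vector-Jacobian product, so each entry of v simply weights one output's gradient (which is why the tutorial's particular values are arbitrary):

```python
import torch

x = torch.tensor([2., 3., 4.], requires_grad=True)
y = x ** 2  # elementwise, so the Jacobian is diag(2*x)

# The vector from the blitz tutorial: its values are arbitrary weights.
v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)

# For this elementwise y, the vector-Jacobian product v^T J is v * 2x,
# i.e. the entries 0.1*4, 1.0*6, 0.0001*8.
print(x.grad)
print(torch.allclose(x.grad, v * 2 * x.detach()))  # True
```

With v = torch.ones(3) you recover the unweighted gradient 2x, which is what L.backward(torch.ones(L.shape)) in the other tutorial does.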