In the documentation of torch.autograd.grad, the first two parameters are described as follows:
outputs (sequence of Tensor) – outputs of the differentiated function.
inputs (sequence of Tensor) – Inputs w.r.t. which the gradient will be returned (and not accumulated into .grad).
I tried the following:
import torch
a = torch.rand(2, requires_grad=True)
b = torch.rand(2, requires_grad=True)
c = a + b
d = a - b
torch.autograd.grad([c, d], [a, b])  # ValueError: only one element tensors can be converted to Python scalars
torch.autograd.grad(torch.tensor([c, d]), torch.tensor([a, b]))  # RuntimeError: grad can be implicitly created only for scalar outputs
I would like to get gradients of a list of tensors w.r.t another list of tensors. What is the correct way to feed the parameters?
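For reference, here is a minimal sketch of what I have been experimenting with: passing the outputs and inputs as plain Python lists and supplying an explicit grad_outputs (one tensor of ones per output, i.e. the "vector" of the vector-Jacobian product) seems to avoid both errors, though I am not sure this is the intended way to use the API:

import torch

a = torch.rand(2, requires_grad=True)
b = torch.rand(2, requires_grad=True)
c = a + b
d = a - b

# Supply one "vector" per non-scalar output so the gradients are well defined
grad_a, grad_b = torch.autograd.grad(
    outputs=[c, d],
    inputs=[a, b],
    grad_outputs=[torch.ones_like(c), torch.ones_like(d)],
)
print(grad_a)  # dc/da + dd/da, weighted by the ones vectors -> tensor([2., 2.])
print(grad_b)  # dc/db + dd/db -> tensor([0., 0.])

Is this the recommended approach, or is there a more direct way to get per-output gradients instead of this weighted sum?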