I am trying to extract the weights from a linear layer, but they do not appear to change, although the error is dropping monotonically (i.e. training is happening). When I print the sum of the weights, it stays constant:

np.sum(model.fc2.weight.data.numpy())

Here are the code snippets:

def train(epochs):
    model.train()
    for epoch in range(1, epochs+1):
        # Train on train set
        print(np.sum(model.fc2.weight.data.numpy()))
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = Variable(data), Variable(data)
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

and

# Define model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(100, 80, bias=False)
        init.normal(self.fc1.weight, mean=0, std=1)
        self.fc2 = nn.Linear(80, 87)
        self.fc3 = nn.Linear(87, 94)
        self.fc4 = nn.Linear(94, 100)

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = F.relu(self.fc4(x))
        return x

Maybe I am looking at the wrong parameters, although I checked the docs. Thanks for your help!

Have you checked whether the gradient with respect to the variable is changing? You can use the register_hook() function on the variable for this. - Kashyap
I found the error. Both variables pointed to the same memory... Sorry! BTW: Nice function! - N8_Coder
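
A minimal sketch of the register_hook() suggestion above. It assumes a stand-in nn.Linear(3, 5) layer, random input, and a made-up squared-output loss rather than the asker's Net and data; the names layer, grad_norms, and handle are only illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 5)  # stand-in layer, not the asker's model

grad_norms = []
# The hook is called with the gradient each time it is computed for this tensor.
# list.append returns None, so the gradient itself is left unmodified.
handle = layer.weight.register_hook(lambda grad: grad_norms.append(grad.norm().item()))

x = torch.randn(4, 3)          # random batch, purely for illustration
loss = layer(x).pow(2).mean()  # hypothetical loss
loss.backward()

handle.remove()  # detach the hook once it is no longer needed
print(grad_norms)
```

If the recorded norm is non-zero on every batch, gradients are reaching the layer, and the problem is more likely in how the weights are being read out.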

1 Answer

Use model.parameters() to get the trainable weights of any model or layer. It returns a generator, so wrap it in list() if you want to print the values.

The following code snippet worked:

>>> import torch
>>> import torch.nn as nn
>>> l = nn.Linear(3,5)
>>> w = list(l.parameters())
>>> w
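
To check whether the weights actually change during training, one option is to take an independent copy before an optimizer step and compare afterwards. This is only a sketch under assumptions: the SGD optimizer, learning rate, random input, and squared-output loss are placeholders, not the setup from the question. Note that the array returned by .numpy() shares memory with the tensor, so a snapshot should be made with clone() (or a NumPy copy).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 5)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)  # placeholder optimizer

# clone() gives an independent copy that later in-place updates cannot touch.
before = layer.weight.detach().clone()

x = torch.randn(4, 3)          # random batch, purely for illustration
loss = layer(x).pow(2).mean()  # hypothetical loss
optimizer.zero_grad()
loss.backward()
optimizer.step()

after = layer.weight.detach()
changed = not torch.equal(before, after)
max_abs_change = (after - before).abs().max().item()
print(changed, max_abs_change)

# named_parameters() pairs each parameter with its name, which helps when
# checking that the right tensor is being inspected.
for name, p in layer.named_parameters():
    print(name, tuple(p.shape))
```

Comparing against a cloned snapshot avoids the situation where both the "old" and "new" values refer to the same underlying storage and therefore always look identical.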