
I built an autoencoder with several convolutional layers, a fully connected layer in the middle, and several deconvolutional (transposed convolution) layers. The input is Atari frames (210x160x3). My problem is that I can currently only process square images (equal height and width). When I use the original frame size, this warning appears:

UserWarning: Using a target size (torch.Size([30, 3, 210, 160])) that is different to the input size (torch.Size([30, 3, 158, 158])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)

I think the problem originates in the transition from the linear layer to the first deconv layer, because the spatial dimensions at that point are always square:

Linear: torch.Size([30, 1000])
Deconv: torch.Size([30, 1000, 1, 1])

However, I want the tensor entering the first deconv layer to have a shape like torch.Size([30, 1000, 11, 8]).
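To illustrate why a 1x1 starting point forces a square output, here is a minimal sketch of the size arithmetic. It uses the output-size rule documented for `ConvTranspose2d`; the layer settings (five layers with kernel 4, stride 2, padding 1) are hypothetical, since the actual layer definitions are not shown in this post, so the resulting numbers will not match 158 or 210x160.

```python
def deconv_out(size, kernel, stride=1, padding=0, output_padding=0, dilation=1):
    """Output length along one axis of a transposed convolution."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


# Hypothetical decoder: five layers, each with kernel=4, stride=2, padding=1.
# These values are assumptions for illustration, not the settings of the real model.
LAYERS = [dict(kernel=4, stride=2, padding=1)] * 5


def decode_size(hw):
    """Apply every layer to a (height, width) pair and return the final pair."""
    h, w = hw
    for layer in LAYERS:
        h, w = deconv_out(h, **layer), deconv_out(w, **layer)
    return h, w


print(decode_size((1, 1)))   # (32, 32): equal sides in, equal sides out
print(decode_size((11, 8)))  # (352, 256): the 11:8 aspect ratio is preserved
```

With symmetric kernels, strides, and padding, height and width go through identical arithmetic, so a 1x1 input can only ever grow into a square output. A non-square result needs either a non-square input to the first deconv layer or per-axis layer settings.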

I am pretty sure there is a simple solution, but I haven't found it yet.

Code snippet of the forward function:

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))

        x = x.view(x.size(0), -1)
        x = torch.sigmoid(self.lv(x))
        x = x.unsqueeze(-1).unsqueeze(-1)

        x = F.relu(self.deconv1(x))
        x = F.relu(self.deconv2(x))
        x = F.relu(self.deconv3(x))
        x = F.relu(self.deconv4(x))
        x = F.relu(self.deconv5(x))
        return x

Thank you so much in advance!


1 Answer


I solved it! This line:

    x = x.view(x.size(0), -1, 11, 8)

instead of:

    x = x.unsqueeze(-1).unsqueeze(-1)

will do the job.
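A minimal sketch of the difference between the two reshapes, assuming PyTorch is installed. The layer sizes here (2000 input features, 16 channels) are hypothetical and only chosen so the numbers work out. One caveat: `view(..., -1, 11, 8)` requires the number of features coming out of the linear layer to be divisible by 11 * 8 = 88, so an output of exactly 1000 features would not reshape this way.

```python
import torch
import torch.nn as nn

batch, height, width = 30, 11, 8
channels = 16                         # hypothetical channel count
latent = channels * height * width    # 1408 features, divisible by 11 * 8

lv = nn.Linear(2000, latent)          # hypothetical: 2000 flattened encoder features
x = torch.sigmoid(lv(torch.randn(batch, 2000)))

# unsqueeze appends two size-1 axes, so the spatial map is always 1 x 1.
as_1x1 = x.unsqueeze(-1).unsqueeze(-1)

# view spreads the same features over an 11 x 8 grid; -1 infers the channel count.
as_grid = x.view(x.size(0), -1, height, width)

print(as_1x1.shape)   # torch.Size([30, 1408, 1, 1])
print(as_grid.shape)  # torch.Size([30, 16, 11, 8])

# A feature count that is not a multiple of 88 cannot be viewed this way.
try:
    torch.zeros(batch, 1000).view(batch, -1, height, width)
except RuntimeError as err:
    print("view failed:", err)
```

The first deconv layer's `in_channels` would then need to match the inferred channel count (16 in this sketch) rather than the full feature count.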