I built an autoencoder with several convolutional layers, a fully connected layer in the middle, and several deconvolutional (transposed convolution) layers. I use Atari frames as input (210x160x3). My problem at the moment is that I can only process square images (height equal to width). When I use the original frame size, this warning occurs:
    UserWarning: Using a target size (torch.Size([30, 3, 210, 160])) that is different to the input size (torch.Size([30, 3, 158, 158])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
      return F.mse_loss(input, target, reduction=self.reduction)
I think the problem originates in the transition from the linear layer to the first deconv layer, because the spatial dimensions of the tensor always come out square:
    Linear: torch.Size([30, 1000])
    Deconv: torch.Size([30, 1000, 1, 1])
However, I want the tensor going into the deconv layer to have a non-square shape such as torch.Size([30, 1000, 11, 8]).
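The shape difference can be illustrated in isolation. The sketch below only looks at tensor shapes: it uses random tensors as stand-ins for the linear layer's output, and the channel count of 16 and the idea of a wider linear layer are assumptions made for illustration, not part of the model above.

```python
import torch

batch, features = 30, 1000

# Stand-in for the output of the linear layer `lv` (random values, shapes only).
z = torch.rand(batch, features)

# What forward() does now: two trailing singleton dimensions, i.e. a 1x1 map.
square = z.unsqueeze(-1).unsqueeze(-1)
print(square.shape)  # torch.Size([30, 1000, 1, 1])

# Hypothetical alternative: a linear layer with channels * 11 * 8 outputs,
# reshaped into a non-square feature map. `channels` is an arbitrary choice.
channels, height, width = 16, 11, 8
wide = torch.rand(batch, channels * height * width)
rect = wide.view(batch, channels, height, width)
print(rect.shape)  # torch.Size([30, 16, 11, 8])
```

Note that `view` can only redistribute elements that already exist, so a 1000-feature vector cannot become a [30, 1000, 11, 8] tensor without a layer that produces 1000 * 11 * 8 values per sample.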
I am pretty sure there is a simple solution, but I haven't found it yet.
Code snippet of the forward function:
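    def forward(self, x):
        # Encoder: four convolutional layers
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        # Flatten to (batch, features) for the fully connected layer
        x = x.view(x.size(0), -1)
        x = torch.sigmoid(self.lv(x))
        # Add two singleton spatial dimensions: (batch, 1000) -> (batch, 1000, 1, 1)
        x = x.unsqueeze(-1).unsqueeze(-1)
        # Decoder: five deconvolutional layers
        x = F.relu(self.deconv1(x))
        x = F.relu(self.deconv2(x))
        x = F.relu(self.deconv3(x))
        x = F.relu(self.deconv4(x))
        x = F.relu(self.deconv5(x))
        return x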
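To show why a 1x1 input to the decoder can only produce a square output when the kernels, strides and paddings are the same in both directions, here is a small sketch that applies the output-size formula documented for `nn.ConvTranspose2d` to each dimension. The layer settings (kernel 4, stride 2, padding 1, five layers) are hypothetical and chosen only because they double the size at each step; they are not the settings of the model above.

```python
def deconv_out(size, kernel, stride=1, padding=0, output_padding=0, dilation=1):
    """Output length of one spatial dimension after a transposed convolution."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


def decoder_shape(height, width, layers):
    """Apply the formula layer by layer to both spatial dimensions."""
    for params in layers:
        height = deconv_out(height, **params)
        width = deconv_out(width, **params)
    return height, width


# Hypothetical decoder: five identical layers, each doubling the spatial size.
layers = [dict(kernel=4, stride=2, padding=1)] * 5

print(decoder_shape(1, 1, layers))   # (32, 32)
print(decoder_shape(11, 8, layers))  # (352, 256)
```

With symmetric layer settings the aspect ratio of the decoder input carries through to the output, so a 1x1 start stays square while an 11x8 start stays rectangular.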
Thank you so much in advance!