
I'm trying to translate the 3D CNN architecture below from Keras to PyTorch. The 3D images all have the following dimensions: 193 x 229 x 193.

Network architecture in Keras:

def test_model(size):
    model = Sequential()
    model.add(Conv3D(filters=8,
                     kernel_size=(3, 3, 3),
                     activation='relu',
                     input_shape=size,
                     name="conv_1_1"))
    model.add(Conv3D(filters=8,
                     kernel_size=(3, 3, 3),
                     activation='relu',
                     name="conv_1_2"))
    model.add(MaxPooling3D(pool_size=(2, 2, 2),
                           strides=(2, 2, 2)))
    model.add(Conv3D(filters=16,
                     kernel_size=(3, 3, 3),
                     activation='relu',
                     input_shape=size,
                     name="conv_2_1"))
    model.add(Conv3D(filters=16,
                     kernel_size=(3, 3, 3),
                     activation='relu',
                     name="conv_2_2"))
    model.add(MaxPooling3D(pool_size=(2, 2, 2),
                           strides=(2, 2, 2)))
    model.add(Flatten())
    model.add(Dense(units=1,
                    name="d_2"))
    return model

My attempt at translating this to PyTorch:

class Model(torch.nn.Module): 
    
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv3d(input_channel=1, output_channel=8, kernel_size=3)
        self.conv2 = nn.Conv3d(input_channel=8, output_channel=8, kernel_size=3)
        self.conv3 = nn.Conv3d(input_channel=8, output_channel=16, kernel_size=3)
        self.conv4 = nn.Conv3d(input_channel=16, output_channel=16, kernel_size=3)
        self.fc1 = nn.Linear( ???? , 1)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(F.max_pool3d(self.conv2(x), kernel_size=2, stride=2))
        x = F.relu(self.conv3(x))
        x = F.relu(F.max_pool3d(self.conv4(x), kernel_size=2, stride=2))
        x = self.fc1

        return x
    
net = Model()

Could you please let me know where I've made mistakes, and also clarify how to determine the input size for the nn.Linear( ???? , 1) layer? Thank you for your help!


1 Answer


Inspecting the output of your last nn.Conv3d, you get a tensor of shape (-1, 16, 45, 54, 45). Therefore, your dense layer needs 16 * 45 * 54 * 45 = 1749600 input features (this is tremendously large!).
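That shape can be derived by hand, without running the model: each 3x3x3 "valid" (no-padding) convolution shrinks every spatial dimension by 2, and each 2x2x2 max-pool with stride 2 roughly halves it (with flooring). A quick sketch of the arithmetic:

```python
def conv3d_out(size, kernel=3, stride=1):
    """Spatial output dims of a no-padding 3D convolution."""
    return tuple((d - kernel) // stride + 1 for d in size)

def pool3d_out(size, kernel=2, stride=2):
    """Spatial output dims of a 3D max-pool."""
    return tuple((d - kernel) // stride + 1 for d in size)

size = (193, 229, 193)
size = conv3d_out(size)  # conv1 -> (191, 227, 191)
size = conv3d_out(size)  # conv2 -> (189, 225, 189)
size = pool3d_out(size)  # pool  -> (94, 112, 94)
size = conv3d_out(size)  # conv3 -> (92, 110, 92)
size = conv3d_out(size)  # conv4 -> (90, 108, 90)
size = pool3d_out(size)  # pool  -> (45, 54, 45)

flat_features = 16 * size[0] * size[1] * size[2]
print(flat_features)  # 1749600
```

The same number can also be obtained empirically by passing a dummy tensor of shape (1, 1, 193, 229, 193) through the convolutional layers and reading off the flattened size.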

Some other things to point out:

  • The arguments input_channel and output_channel should be in_channels and out_channels, respectively.
  • You can either use torch.flatten(x, start_dim=1) or an nn.Flatten() layer (which flattens from axis=1 to axis=-1 by default).
  • You have misplaced the F.relu activations: the intended structure is [conv3d, relu, conv3d, relu, maxpool3d] + [conv3d, relu, conv3d, relu, maxpool3d] + [flatten, dense]. Also note that x = self.fc1 only references the layer; it must be called as self.fc1(x) on the flattened tensor.

Resulting code:

import torch
from torch import nn
import torch.nn.functional as F

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3)
        self.conv2 = nn.Conv3d(in_channels=8, out_channels=8, kernel_size=3)
        self.conv3 = nn.Conv3d(in_channels=8, out_channels=16, kernel_size=3)
        self.conv4 = nn.Conv3d(in_channels=16, out_channels=16, kernel_size=3)
        self.fc1 = nn.Linear(1749600, 1)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool3d(x, kernel_size=2, stride=2)

        x = F.relu(self.conv3(x))
        x = F.relu(self.conv4(x))
        x = F.max_pool3d(x, kernel_size=2, stride=2)

        x = torch.flatten(x, start_dim=1)
        x = self.fc1(x)
        return x
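To see how much that 1,749,600-wide flatten dominates the model, here is a rough parameter tally (weights plus biases per layer), computed by hand rather than via torch; the layer sizes are taken from the model above:

```python
def conv3d_params(c_in, c_out, k=3):
    """Parameters of a Conv3d layer: one k*k*k kernel per (in, out) channel pair, plus biases."""
    return c_out * (c_in * k ** 3) + c_out

def linear_params(n_in, n_out):
    """Parameters of a Linear layer: weight matrix plus biases."""
    return n_out * n_in + n_out

total = (conv3d_params(1, 8)      # conv1
         + conv3d_params(8, 8)    # conv2
         + conv3d_params(8, 16)   # conv3
         + conv3d_params(16, 16)  # conv4
         + linear_params(1749600, 1))  # fc1
print(total)  # 1761961
```

The four convolutional layers together contribute only 12,360 parameters, so the dense layer accounts for over 99% of the total. If you want to shrink it, consider adding more pooling stages, or replacing the flatten with global average pooling (e.g. nn.AdaptiveAvgPool3d(1)) before the dense layer.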