
I'm trying to create a 3-dimensional convolutional neural network autoencoder, but I'm unable to match the input dimension of the tensor with the output dimension.

I have tried changing the layer shapes and using the Keras autoencoder.

        # K is the Keras backend: from tensorflow.keras import backend as K
        padding = 'SAME'
        stride = [1, 1, 1]

        self.inputs_ = tf.placeholder(tf.float32, input_shape, name='inputs')
        self.targets_ = tf.placeholder(tf.float32, input_shape, name='targets')

        # encoder
        conv1 = tf.layers.conv3d(inputs=self.inputs_, filters=16, kernel_size=(3, 3, 3), padding=padding, strides=stride, activation=tf.nn.relu)
        maxpool1 = tf.layers.max_pooling3d(conv1, pool_size=(2, 2, 2), strides=(2, 2, 2), padding=padding)
        conv2 = tf.layers.conv3d(inputs=maxpool1, filters=32, kernel_size=(3, 3, 3), padding=padding, strides=stride, activation=tf.nn.relu)
        maxpool2 = tf.layers.max_pooling3d(conv2, pool_size=(3, 3, 3), strides=(3, 3, 3), padding=padding)
        conv3 = tf.layers.conv3d(inputs=maxpool2, filters=96, kernel_size=(2, 2, 2), padding=padding, strides=stride, activation=tf.nn.relu)
        maxpool3 = tf.layers.max_pooling3d(conv3, pool_size=(2, 2, 2), strides=(2, 2, 2), padding=padding)
        # latent internal representation

        # decoder
        # tf.keras.layers.UpSampling3D()
        unpool1 = K.resize_volumes(maxpool3, 2, 2, 2, "channels_last")
        deconv1 = tf.layers.conv3d_transpose(inputs=unpool1, filters=96, kernel_size=(2, 2, 2), padding=padding, strides=stride, activation=tf.nn.relu)
        unpool2 = K.resize_volumes(deconv1, 3, 3, 3, "channels_last")
        deconv2 = tf.layers.conv3d_transpose(inputs=unpool2, filters=32, kernel_size=(3, 3, 3), padding=padding, strides=stride, activation=tf.nn.relu)
        unpool3 = K.resize_volumes(deconv2, 2, 2, 2, "channels_last")
        deconv3 = tf.layers.conv3d_transpose(inputs=unpool3, filters=16, kernel_size=(3, 3, 3), padding=padding, strides=stride, activation=tf.nn.relu)
        self.output = tf.layers.dense(inputs=deconv3, units=3)
        self.output = tf.reshape(self.output, self.input_shape)

ValueError: Cannot reshape a tensor with 1850688 elements to shape [1,31,73,201,3] (1364589 elements) for 'Reshape' (op: 'Reshape') with input shapes: [1,36,84,204,3], [5] and with input tensors computed as partial shapes: input[1] = [1,31,73,201,3].

1 Answer


Your input shape is [1, 31, 73, 201, 3]. In the decoder you upscale by factors of [2,2,2], [3,3,3] and [2,2,2] in your three resize_volumes layers. Multiplying these factors along each axis gives 2*3*2 = 12, so every spatial dimension of the decoder output will be a multiple of 12.
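This arithmetic can be checked with a quick sketch (the helper name is mine): with padding='SAME', each max-pool divides a spatial dimension by its stride, rounding up, and each resize_volumes multiplies it back.

```python
import math

def decoder_output_dim(d, factors=(2, 3, 2)):
    # Encoder: each pooling layer divides the dimension, SAME rounds up
    for f in factors:
        d = math.ceil(d / f)
    # Decoder: each resize_volumes multiplies the dimension back
    for f in factors:
        d *= f
    return d

print([decoder_output_dim(d) for d in (31, 73, 201)])  # [36, 84, 204]
```

This reproduces the [1, 36, 84, 204, 3] shape in the error message.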

But your input's spatial dimensions [x, 31, 73, 201, x] are not multiples of 12. The nearest multiples greater than those dimensions are [x, 36, 84, 204, x]. So you can either strip the excess from the decoder output to match the original dimensions, or, the better solution, pad the input with zeros so that each spatial dimension is a multiple of 12. If you follow the second solution, you will have to account for the new input dimensions afterwards.
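The per-axis padding amounts can be computed like this (a small sketch; splitting the zeros roughly evenly between the two edges is an arbitrary choice):

```python
def pad_amounts(d, multiple=12):
    # Total zeros needed to round d up to the next multiple
    excess = (-d) % multiple
    # Split between the leading and trailing edge
    return excess // 2, excess - excess // 2

print([pad_amounts(d) for d in (31, 73, 201)])  # [(2, 3), (5, 6), (1, 2)]
```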

Updated code (only the changed part):

self.inputs_ = tf.placeholder(tf.float32, input_shape, name='inputs')
pad_inputs = tf.pad(self.inputs_, [[0, 0], [2, 3], [5, 6], [1, 2], [0, 0]])  # pad at the edges
print(pad_inputs.shape)  # [1, 36, 84, 204, 3]

conv1 = tf.layers.conv3d(inputs=pad_inputs, filters=16, kernel_size=(3, 3, 3), padding=padding, strides=stride, activation=tf.nn.relu)

And at the end,

self.output = tf.reshape(self.output, pad_inputs.shape)
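If you go with the first option instead (cropping the decoder output), a minimal sketch with NumPy shows the slicing; the same indexing works on a TensorFlow tensor:

```python
import numpy as np

# Dummy stand-in for the [1, 36, 84, 204, 3] decoder output
decoded = np.zeros((1, 36, 84, 204, 3), dtype=np.float32)

# Crop back to the original input shape [1, 31, 73, 201, 3]
cropped = decoded[:, :31, :73, :201, :]
print(cropped.shape)  # (1, 31, 73, 201, 3)
```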