I have a neural network with three consecutive convolutional layers, which are purely linear operations since there are no activation functions between them. After training the network and obtaining the weights, I would like to collapse all three layers into a single equivalent layer.
How can this be done in practice, given that each layer has a different kernel size and stride?
The layers are as follows:
- Convolution layer with a 3x3 kernel, 5 input channels and 5 output channels (a tensor of size 3x3x5x5), with stride 1 and padding "same"
- Convolution layer with a 5x5 kernel, 5 input channels and 50 output channels (a tensor of size 5x5x5x50), with stride 2 and padding "same"
- Convolution layer with a 3x3 kernel, 50 input channels and 50 output channels (a tensor of size 3x3x50x50), with stride 1 and padding "same"
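For reference, here is a minimal PyTorch sketch of the stack as I understand it (the variable names are my own; note that PyTorch stores conv weights as `(out_ch, in_ch, kH, kW)` rather than `kH x kW x in x out`, and that `padding='same'` is not accepted with `stride > 1`, so the middle layer uses the equivalent explicit padding of 2):

```python
import torch
import torch.nn as nn

# Sketch of the three-layer stack described above.
# PyTorch disallows padding='same' when stride > 1, so the
# stride-2 layer uses the equivalent explicit padding of 2.
stack = nn.Sequential(
    nn.Conv2d(5, 5, kernel_size=3, stride=1, padding='same'),
    nn.Conv2d(5, 50, kernel_size=5, stride=2, padding=2),
    nn.Conv2d(50, 50, kernel_size=3, stride=1, padding='same'),
)

x = torch.randn(1, 5, 32, 32)  # dummy input: batch 1, 5 channels, 32x32
y = stack(x)
print(y.shape)  # torch.Size([1, 50, 16, 16]) — halved spatially by the stride-2 layer
```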
Thanks in advance