I am working with the MNIST dataset and using Keras to train a convolutional neural network. There is something about the weight matrices that I do not understand.
The input layer has 28x28=784 neurons. Then I use:
Conv2D(32, kernel_size=(7, 7), strides=(3, 3), use_bias=False)
Conv2D(64, kernel_size=(5, 5), strides=(2, 2), use_bias=False)
Flatten()
Dense(200, use_bias=False)
Dense(150, use_bias=False)
Dense(10, use_bias=False, activation="softmax")
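For completeness, these layers sit in a Sequential model with a 28x28x1 input. Roughly (a minimal sketch; the tf.keras import paths and the compile/fit boilerplate are my assumptions and not the point of the question):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),  # MNIST images: 28x28 pixels, 1 channel
    layers.Conv2D(32, kernel_size=(7, 7), strides=(3, 3), use_bias=False),
    layers.Conv2D(64, kernel_size=(5, 5), strides=(2, 2), use_bias=False),
    layers.Flatten(),
    layers.Dense(200, use_bias=False),
    layers.Dense(150, use_bias=False),
    layers.Dense(10, use_bias=False, activation="softmax"),
])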
After training the model and setting W = model.get_weights(), I print W[i].shape for each i and obtain:
(7,7,1,32)
(5,5,32,64)
(256,200)
(200,150)
(150,10)
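For reference, the check itself is just this loop:

W = model.get_weights()
for i in range(len(W)):
    print(W[i].shape)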
As far as I understand, this means that the first hidden layer consists of 32 images (feature maps) of size 8x8 = 64 neurons each (since (28-7)/3+1 = 8), and therefore there are 64x32 = 2048 neurons in the first hidden layer.
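The formula I am using for the spatial output size of a 'valid' convolution is floor((n - k)/s) + 1. As a quick sanity check in code (the helper name is just for illustration):

def conv_output_size(n, k, s):
    # 'valid' padding: output size = floor((n - k) / s) + 1
    return (n - k) // s + 1

print(conv_output_size(28, 7, 3))  # 8 -> 8x8 maps after the first Conv2D
print(conv_output_size(8, 5, 2))   # 2 -> 2x2 maps after the second Conv2D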
The next part is the one that confuses me. Since the next convolution has kernel size (5,5), stride (2,2), and 64 filters, does this mean that we apply 64 convolutions to each of the 32 8x8 images obtained in the first hidden layer? That would give 64x32 = 2048 images of size 2x2 (since (8-5)/2+1 = 2), and therefore 2048x4 = 8192 neurons in the second hidden layer. But the weight matrix of the next layer has shape (256, 200). Shouldn't it be of shape (8192, 200)? What is happening here?
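In case it helps, this is roughly how I would double-check the per-layer output shapes against my hand count (a sketch, assuming the Sequential model built as above):

model.summary()  # reports the output shape of every layer
# or, layer by layer:
for layer in model.layers:
    print(layer.name, layer.output.shape)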