I'm praticing CNNs. I read some papers about training MNIST dataset use CNNs.size of image is 28x28 and use architecture 5 layers: input>conv1-maxpool1>conv2-maxpool2>fully connected>output
Convolutional Layer #1
- Computes 32 features using a 5x5 filter with ReLU activation.
- Padding is added to preserve width and height.
- Input Tensor Shape: [batch_size, 28, 28, 1]
- Output Tensor Shape: [batch_size, 28, 28, 32]
Pooling Layer #1
- First max pooling layer with a 2x2 filter and stride of 2
- Input Tensor Shape: [batch_size, 28, 28, 32]
- Output Tensor Shape: [batch_size, 14, 14, 32]
Convolutional Layer #2
- Computes 64 features using a 5x5 filter.
- Padding is added to preserve width and height.
- Input Tensor Shape: [batch_size, 14, 14, 32]
- Output Tensor Shape: [batch_size, 14, 14, 64]
Pooling Layer #2
- Second max pooling layer with a 2x2 filter and stride of 2
- Input Tensor Shape: [batch_size, 14, 14, 64]
- Output Tensor Shape: [batch_size, 7, 7, 64]
Flatten tensor into a batch of vectors
- Input Tensor Shape: [batch_size, 7, 7, 64]
- Output Tensor Shape: [batch_size, 7 * 7 * 64]
Fully Connected Layer
- Densely connected layer with 1024 neurons
- Input Tensor Shape: [batch_size, 7 * 7 * 64]
- Output Tensor Shape: [batch_size, 1024] Output layer
- Input Tensor Shape: [batch_size, 1024]
- Output Tensor Shape: [batch_size, 10]
In conv1, with 1 input computates 32 features using a 5x5 filter and in conv2 with 32 input from conv1 computates 64 features using same filter. What are parameters such as 32,64,2x2 filter chosen based on? Do they based on size of image?
If size of images is larger than 28x28 such as 128x128. Should I increse the number of layers over 5 layers? How are above parameters changed with other size of images?
Thank advance