
I'm going through the 'Expert MNIST' TensorFlow tutorial (https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html) and I'm stuck on this part:

Densely Connected Layer

Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.
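For reference, the corresponding code from the tutorial is roughly the following (weight_variable and bias_variable are helpers defined earlier in the tutorial; h_pool2 is the output of the second pooling layer, shown here as a placeholder stand-in so the snippet is self-contained):

    import tensorflow as tf

    # helpers as defined earlier in the tutorial
    def weight_variable(shape):
        return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

    def bias_variable(shape):
        return tf.Variable(tf.constant(0.1, shape=shape))

    # stand-in for the second pooling layer's output, shape [batch, 7, 7, 64]
    # (in the tutorial it comes from the conv/pool stack)
    h_pool2 = tf.placeholder(tf.float32, [None, 7, 7, 64])

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])

    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)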

Why the number 1024? Where did that come from?

My understanding of the fully connected layer is that it somehow has to get back to the original image size (and then we start plugging things into our softmax equation). In this case, the original image size is height x width x channels = 28*28*1 = 784... not 1024.

What am I missing here?


1 Answer


1024 is just an arbitrary number of hidden units. At this point, the input to the network has been reduced to 64 feature maps, each of size 7x7 pixels. They are not trying to "get back to the original image size"; they simply want a layer that can extract global features, so it is densely connected to every single neuron of the last pooling layer (which represents the input space at that depth), whereas the previous operations (convolutions and poolings) extracted local features.

Thus, in order to work with this in MLP fashion, you need 7*7*64 = 3136 input neurons. They add another layer of 1024 on top, so if you draw the network, it would be something along the lines of

 INPUT - CONV - POOL - .... - CONV - POOL - HIDDEN - OUTPUT
 28x28x1                             7x7x64    1024     10
                                     = 3136

The number is thus quite arbitrary; they simply found empirically that it works. You could use any number of hidden units here, or any number of layers.
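To make that concrete, here is a minimal sketch (in the tutorial's TF 0.x style; n_hidden is a hypothetical name I'm introducing) showing that 1024 is just a hyperparameter, with the readout layer adapting to whatever value you pick:

    import tensorflow as tf

    n_hidden = 512  # any positive integer works; 1024 is simply what the tutorial picked

    # stand-in for the second pooling layer's output, shape [batch, 7, 7, 64]
    h_pool2 = tf.placeholder(tf.float32, [None, 7, 7, 64])

    # flatten the 7x7x64 feature maps into a batch of 3136-dim vectors
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])

    # densely connected hidden layer: 3136 -> n_hidden
    W_fc1 = tf.Variable(tf.truncated_normal([7 * 7 * 64, n_hidden], stddev=0.1))
    b_fc1 = tf.Variable(tf.constant(0.1, shape=[n_hidden]))
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # readout layer: n_hidden -> 10 classes; only its input size depends on n_hidden
    W_fc2 = tf.Variable(tf.truncated_normal([n_hidden, 10], stddev=0.1))
    b_fc2 = tf.Variable(tf.constant(0.1, shape=[10]))
    y = tf.matmul(h_fc1, W_fc2) + b_fc2

Only the output size (10, one per digit class) is fixed by the task; the hidden width just trades capacity against overfitting and compute.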