TensorFlow tf.matmul in MNIST tutorial

Question

I'm new to TensorFlow. Here is my question.

The MNIST tutorial (https://www.tensorflow.org/versions/master/get_started/mnist/beginners#mnist-for-ml-beginners) says:

First, we multiply x by W with the expression tf.matmul(x, W). This is flipped from when we multiplied them in our equation, where we had Wx, as a small trick to deal with x being a 2D tensor with multiple inputs. We then add b, and finally apply tf.nn.softmax.

My question is:

Why was b initialized as a vector, b = tf.Variable(tf.zeros([10])),

and not as b = tf.Variable(tf.zeros([None, 10])) or b = tf.Variable(tf.zeros([1, 10]))?

I ask because the shapes in x * W + b appear to be: [None, 784] * [784, 10] + [None, 10]

Thanks for your answers.

Vladimir Bystricky · Accepted Answer · 2017-10-23T12:11:23

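This is because we apply the same operation to every "image" in the batch, not to the batch as a whole. So the shapes in x * W + b are not

[None, 784] * [784, 10] + [None, 10]

but rather (in pseudocode, where merge stacks the per-image results along the batch dimension)

results = merge(None, [784] * [784, 10] + [10])

In other words, the same 10-element bias b is added to each row of the [None, 10] result of tf.matmul(x, W), so b needs no batch dimension.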
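To make the per-image idea concrete, here is a minimal pure-Python sketch (no TensorFlow needed). The toy sizes (3 images, 4 features, 2 classes) and the helper names logits_for_one_image and logits_for_batch are made up for illustration and are not from the tutorial. It computes x @ W + b one row at a time with a single shared bias vector, which is the result that broadcasting the [10] bias over a [None, 10] matrix would give.

```python
# Toy sizes stand in for the tutorial's 784 inputs and 10 classes.
batch = [            # shape [3, 4]: three "images", four features each
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 0.0, 3.0],
    [2.0, 2.0, 1.0, 0.0],
]
W = [                # shape [4, 2]: features x classes
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 2.0],
]
b = [0.5, -1.0]      # shape [2]: one bias per class, no batch dimension


def logits_for_one_image(x, W, b):
    """Compute x @ W + b for a single row vector x (hypothetical helper)."""
    return [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]


def logits_for_batch(batch, W, b):
    """Apply the same W and the same b to every image, then stack the rows."""
    return [logits_for_one_image(x, W, b) for x in batch]


result = logits_for_batch(batch, W, b)
for row in result:
    print(row)
```

The output has one row per image (shape [3, 2] here, [None, 10] in the tutorial), even though b was never given a batch dimension.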
