0 votes

I recently started studying TensorFlow. While doing some exercises, a question came up. As far as I know, there are two ways to define hidden layers:

  1. By using tf.layers.dense to define a fully connected layer, e.g.:

    layer_1 = tf.layers.dense(X, 512, activation=tf.nn.relu)
    layer_2 = tf.layers.dense(layer_1, 256, activation=tf.nn.relu)

  2. By using tf.add(tf.matmul(X, W), b), a direct matrix multiplication, to define a layer, e.g.:

    w1 = tf.Variable(tf.random_normal([in_size, out_size]))
    b1 = ...
    w2 = ...
    b2 = ...

    layer_1 = tf.add(tf.matmul(x, w1), b1)
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, w2), b2)
    layer_2 = tf.nn.relu(layer_2)

I tried both ways to build a multilayer NN, and both work. My question: is there a difference between them? My guess: in approach 2, W and b can be monitored in TensorBoard since they are explicitly defined.

Any feedback is appreciated. Thanks.


2 Answers

1 vote

Your first approach will use tf.glorot_uniform_initializer to initialize the weights by default, as mentioned here and here, so there might be a slight difference in performance. I think you can monitor the weights with the first approach as well.
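
If you want the manual approach to match that default, a minimal sketch (TF 1.x; in_size and out_size are placeholder dimensions, not values from the question) would be:

import tensorflow as tf

in_size, out_size = 4, 7  # placeholder dimensions
init = tf.glorot_uniform_initializer()  # same default as tf.layers.dense's kernel
W = tf.Variable(init([in_size, out_size]), dtype=tf.float32)
# tf.layers.dense zero-initializes the bias by default
b = tf.Variable(tf.zeros([out_size]), dtype=tf.float32)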

1 vote

There is absolutely no difference between using tf.layers and defining your own layers by creating the W and b matrices yourself and then applying tf.matmul and tf.add. For example, the first snippet:

tf.reset_default_graph()
tf.set_random_seed(42)
X = tf.ones((5,4), dtype=tf.float32)
init = tf.initializers.random_uniform(minval=-0.1, maxval=0.1, dtype=tf.float32)
logits_first = tf.layers.dense(inputs=X, units=7, kernel_initializer=init,
                               bias_initializer=init)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(logits_first)

evaluates to the same values as the second snippet:

tf.reset_default_graph()
tf.set_random_seed(42)
X = tf.ones((5,4), dtype=tf.float32)
W = tf.Variable(tf.random_uniform([4, 7], -0.1, 0.1), dtype=tf.float32)
b = tf.Variable(tf.random_uniform([7], -0.1, 0.1), dtype=tf.float32)
logits_second = tf.add(tf.matmul(X, W), b)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(logits_second)
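
If you want to check this equivalence end to end, a sketch that simply wraps both snippets in functions and compares their outputs with NumPy:

import numpy as np
import tensorflow as tf

def run_first():
    # Rebuild the first snippet and return its output as a NumPy array.
    tf.reset_default_graph()
    tf.set_random_seed(42)
    X = tf.ones((5, 4), dtype=tf.float32)
    init = tf.initializers.random_uniform(minval=-0.1, maxval=0.1, dtype=tf.float32)
    logits = tf.layers.dense(inputs=X, units=7, kernel_initializer=init,
                             bias_initializer=init)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        return sess.run(logits)

def run_second():
    # Rebuild the second snippet the same way.
    tf.reset_default_graph()
    tf.set_random_seed(42)
    X = tf.ones((5, 4), dtype=tf.float32)
    W = tf.Variable(tf.random_uniform([4, 7], -0.1, 0.1), dtype=tf.float32)
    b = tf.Variable(tf.random_uniform([7], -0.1, 0.1), dtype=tf.float32)
    logits = tf.add(tf.matmul(X, W), b)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        return sess.run(logits)

print(np.allclose(run_first(), run_second()))  # expected: True, per the claim above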

Besides, you can monitor both approaches: everything defined during graph construction can be monitored in TensorBoard.
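
As a minimal sketch (TF 1.x; the summary names and the "./logs" directory are my own choices), you could attach histogram summaries to the weights of both approaches:

import tensorflow as tf

tf.reset_default_graph()
X = tf.ones((5, 4), dtype=tf.float32)

# Approach 1: tf.layers.dense creates variables named "<name>/kernel" and "<name>/bias".
logits = tf.layers.dense(X, 7, name="dense")
kernel = tf.get_default_graph().get_tensor_by_name("dense/kernel:0")
tf.summary.histogram("dense_kernel", kernel)

# Approach 2: an explicitly created variable can be logged directly.
W = tf.Variable(tf.random_normal([4, 7]), name="W")
tf.summary.histogram("manual_W", W)

merged = tf.summary.merge_all()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter("./logs", sess.graph)
    writer.add_summary(sess.run(merged), global_step=0)
    writer.close()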