
I am reading about backpropagation in deep neural networks, and as I understand it, the algorithm for that type of neural network can be summarized as below:

1- Input x: set the corresponding activation for the input layer

2- Feedforward: propagate the input forward through the network, computing the activations of each layer

3- Output error: calculate the error of the output layer

4- Backpropagate the error: calculate the error of each earlier layer, moving backwards through the network

5- Output: obtain the gradient of the cost function from these layer errors
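
For reference, the standard backpropagation equations behind steps 3-5 (writing \delta^l for the error of layer l, \sigma for the activation function, and \odot for element-wise multiplication) are:

\delta^L = \nabla_a C \odot \sigma'(z^L)
\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l)
\partial C / \partial w^l = \delta^l (a^{l-1})^T, \quad \partial C / \partial b^l = \delta^l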

That's fine. I then checked many code examples for this type of deep network; below is an example with explanation:

### imports
import tensorflow as tf

### constant data
x  = [[0.,0.],[1.,1.],[1.,0.],[0.,1.]]
y_ = [[0.],[0.],[1.],[1.]]

### induction
# 1x2 input -> 2x3 hidden sigmoid -> 3x1 sigmoid output

# Layer 0 = the 2 inputs, x
x0 = tf.constant( x  , dtype=tf.float32 )
y0 = tf.constant( y_ , dtype=tf.float32 )

# Layer 1 = the 2x3 hidden sigmoid
m1 = tf.Variable( tf.random_uniform( [2,3] , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
b1 = tf.Variable( tf.random_uniform( [3]   , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
h1 = tf.sigmoid( tf.matmul( x0,m1 ) + b1 )

# Layer 2 = the 3x1 sigmoid output
m2 = tf.Variable( tf.random_uniform( [3,1] , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
b2 = tf.Variable( tf.random_uniform( [1]   , minval=0.1 , maxval=0.9 , dtype=tf.float32  ))
y_out = tf.sigmoid( tf.matmul( h1,m2 ) + b2 )


### loss
# loss : sum of the squares of y0 - y_out
loss = tf.reduce_sum( tf.square( y0 - y_out ) )

# training step : gradient descent (learning rate 1.0) to minimize loss
train = tf.train.GradientDescentOptimizer(1.0).minimize(loss)


### training
# run 500 times using all the X and Y
# print out the loss and any other interesting info
with tf.Session() as sess:
  sess.run( tf.global_variables_initializer() )
  for step in range(500) :
    sess.run(train)

  results = sess.run([m1,b1,m2,b2,y_out,loss])
  labels  = "m1,b1,m2,b2,y_out,loss".split(",")
  for label, result in zip(labels, results):
    print("")
    print(label)
    print(result)

print ""

My question: the above code performs the forward propagation and computes the loss, but I don't see any step that backpropagates the error. In other words, following the description above, I can see steps 1 (Input x), 2 (Feedforward), 3 (Output error), and 5 (Output), but step 4 (Backpropagate the error) does not appear in the code! Is that right, or is something missing from the code? The problem is that all the codes I found online for backpropagation deep neural networks follow the same steps. Could you please describe how the backpropagation step happens in this code, or what I should add to perform it?

Thank you


1 Answer


In simple terms, when you build the TF graph up to the point where you compute the loss, TF already knows on which tf.Variables (weights) the loss depends. Then, when you create the node train = tf.train.GradientDescentOptimizer(1.0).minimize(loss) and later run it in a tf.Session, the backpropagation is done for you in the background. To be more specific, train = tf.train.GradientDescentOptimizer(1.0).minimize(loss) merges the following steps:

# 1. Create a GD optimizer with a learning rate of 1.0
optimizer = tf.train.GradientDescentOptimizer(1.0)
# 2. Compute the gradients for each of the variables (weights) with respect to the loss
gradients, variables = zip(*optimizer.compute_gradients(loss))
# 3. Update the variables (weights) based on the computed gradients
train = optimizer.apply_gradients(zip(gradients, variables))

In particular, step 2 (computing the gradients) is where the backpropagation happens. Hope that this makes things clearer for you!
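
If you want to see the backpropagation pass explicitly in the graph, here is a minimal sketch (my own illustration, reusing the m1, b1, m2, b2 and loss from your code) that computes the gradients with tf.gradients, which is essentially what compute_gradients calls, and then applies the gradient-descent update by hand:

# Backpropagation pass: compute d(loss)/d(w) for every weight
weights = [m1, b1, m2, b2]
grads = tf.gradients(loss, weights)

# Gradient-descent update: w <- w - learning_rate * gradient
learning_rate = 1.0
updates = [w.assign_sub(learning_rate * g) for w, g in zip(weights, grads)]
train = tf.group(*updates)

For plain gradient descent, running this train node in a tf.Session does the same thing as the one produced by minimize(loss).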


Additionally, I would like to restructure the steps in your question:

  1. Input X: The input of the neural network.
  2. Forward pass: Propagating the input through the neural network to obtain the output. In other words, multiplying the input X by each of the tf.Variables (and applying the activation functions) in your code.
  3. Loss: The mismatch between the output obtained in step 2 and the expected output.
  4. Computing the gradients: Computing the gradients of the loss with respect to each of the tf.Variables (weights).
  5. Updating the weights: Updating each tf.Variable (weight) according to its corresponding gradient.

Please note that steps 4 and 5 together are what minimize(loss) performs: step 4 is the backpropagation itself and step 5 is the gradient-descent update.
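
To tie everything together, here is a minimal NumPy sketch (my own illustration, not part of your code) that performs all five steps by hand for the same 2-3-1 sigmoid network and sum-of-squares loss, so you can see what TF does for you in the background:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Same data and shapes as in your code
x = np.array([[0., 0.], [1., 1.], [1., 0.], [0., 1.]])
y = np.array([[0.], [0.], [1.], [1.]])
m1 = np.random.uniform(0.1, 0.9, (2, 3))
b1 = np.random.uniform(0.1, 0.9, 3)
m2 = np.random.uniform(0.1, 0.9, (3, 1))
b2 = np.random.uniform(0.1, 0.9, 1)
lr = 1.0

for step in range(500):
    # Steps 1-2: forward pass
    h1 = sigmoid(x.dot(m1) + b1)
    y_out = sigmoid(h1.dot(m2) + b2)
    # Step 3: loss = sum of the squares of y - y_out
    loss = np.sum((y - y_out) ** 2)
    # Step 4: backpropagate the error with the chain rule, layer by layer
    delta2 = 2.0 * (y_out - y) * y_out * (1.0 - y_out)  # output-layer error
    grad_m2 = h1.T.dot(delta2)
    grad_b2 = delta2.sum(axis=0)
    delta1 = delta2.dot(m2.T) * h1 * (1.0 - h1)         # hidden-layer error
    grad_m1 = x.T.dot(delta1)
    grad_b1 = delta1.sum(axis=0)
    # Step 5: gradient-descent update of every weight
    m1 -= lr * grad_m1
    b1 -= lr * grad_b1
    m2 -= lr * grad_m2
    b2 -= lr * grad_b2

print(loss)

Steps 4 and 5 here are exactly the part that tf.train.GradientDescentOptimizer(1.0).minimize(loss) generates and runs for you.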