0
votes

The code is in Python 3.5.2 with TensorFlow. The neural network returns accuracy values between .10 and 5.00, with the training-set value tending to be roughly 6 times higher than the test-set value. I cannot tell whether the network is legitimately doing worse than random guessing or whether the accuracy code I am using has a serious fault I cannot see.

The neural network consists of 5 layers:

  1. input
  2. conv1 (with max pooling, ReLU, and dropout)
  3. conv2 (with max pooling, ReLU, and dropout)
  4. fully connected (with ReLU)
  5. output

It uses the default Adam optimizer.

I am very suspicious of my accuracy calculation, as I wrote it differently from the examples I have seen because of RAM constraints. The calculation computes the accuracy of both the training and test data.

        acc_total = 0
        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        for _ in range(int(mnist.test.num_examples/batch_size)):
            test_x, test_y = mnist.test.next_batch(batch_size)
            acc = accuracy.eval(feed_dict={x: test_x, y: test_y})
            acc_total += acc
            print('Accuracy:',acc_total*batch_size/float(mnist.test.num_examples),end='\r')
        print('Epoch', epoch, 'current test set accuracy : ',acc_total*batch_size/float(mnist.test.num_examples))

        acc_total=0
        correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        for _ in range(int(mnist.train.num_examples/batch_size)):
            train_x, train_y = mnist.train.next_batch(batch_size)
            acc = accuracy.eval(feed_dict={x: train_x, y: train_y})
            acc_total += acc
            print('Accuracy:',acc_total*batch_size/float(mnist.train.num_examples),end='\r')
        print('Epoch', epoch, 'current train set accuracy : ',acc_total*batch_size/float(mnist.test.num_examples))

This is a sample of the outputs:

  • Epoch 0 completed out of 20 loss: 10333239.3396 83.29 ts 429
  • Epoch 0 current test set accuracy : 0.7072
  • Epoch 0 current train set accuracy : 3.8039
  • Epoch 1 completed out of 20 loss: 1831489.40747 39.24 ts 858
  • Epoch 1 current test set accuracy : 0.7765
  • Epoch 1 current train set accuracy : 4.2239
  • Epoch 2 completed out of 20 loss: 1010191.40466 25.89 ts 1287
  • Epoch 2 current test set accuracy : 0.8069
  • Epoch 2 current train set accuracy : 4.3898
  • Epoch 3 completed out of 20 loss: 631960.809082 0.267 ts 1716
  • Epoch 3 current test set accuracy : 0.8277
  • Epoch 3 current train set accuracy : 4.4955
  • Epoch 4 completed out of 20 loss: 439149.724823 2.001 ts 2145
  • Epoch 4 current test set accuracy : 0.8374
  • Epoch 4 current train set accuracy : 4.5674

The full code is as follows (sorry about the length, I added a lot of comments for my own use):

    import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#Imported Data set
mnist = input_data.read_data_sets("/tmp/data/", one_hot = True)

#ammount of output classes
n_classes = 10

#ammount of examples processed at once
#memory impact of ~500MB for 128 with more on eval runs
batch_size = 128

#Times to cycle through the entire imput data set
epoch_amm =20

#Input and outputs placeholders
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32)

#Dropout is 1-keeprate; fc- fully conected layer dropout;conv conv layer droupout
keep_rate_fc=.5
keep_rate_conv=.75
keep_prob=tf.placeholder(tf.float32)

#Regularization paramaters
Regularization_active= False #True and False MUST be capitalized
Lambda= 1.0 #'weight' of the weights on the loss function

# counter for total steps taken by trainer 
training_steps = 1

#Learning Rate For Network
base_Rate   = .03
decay_steps = 64
decay_rate  = .96
Staircase   = True
Learning_Rate = tf.train.exponential_decay(base_Rate, training_steps, decay_steps, decay_rate, staircase='Staircase', name='Exp_decay' )

#Convolution Function returns neuronns that act on a section of prev. layer
def conv2d(x,W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

#Pooling function returns max value in 2 by 2 sections    
def maxpool2d(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

def relu(x):
    return tf.nn.relu(x,'relu')

def add(x, b):
    return tf.add(x,b)


#'Main' method, contains the Neural Network    
def convolutional_neural_network(x):
    weights = {'W_conv1':tf.Variable(tf.random_normal([5,5,1,32])),
               'W_conv2':tf.Variable(tf.random_normal([5,5,32,64])),
               'W_fc':tf.Variable(tf.random_normal([7*7*64,1024])),
               'W_out':tf.Variable(tf.random_normal([1024,n_classes]))}

    biases = {'B_conv1':tf.Variable(tf.random_normal([32])),
              'B_conv2':tf.Variable(tf.random_normal([64])),       
              'B_fc':tf.Variable(tf.random_normal([1024])),
              'B_out':tf.Variable(tf.random_normal([n_classes]))}

    # Input layer
    x = tf.reshape(x, shape=[-1,28,28,1])

    #first layer. pass inputs through conv2d and save as conv1 then apply maxpool2d
    conv1 = conv2d(x,weights['W_conv1'])
    conv1 = add(conv1,biases['B_conv1'])
    conv1 = relu(conv1)
    conv1 = maxpool2d(conv1)
    conv1 = tf.nn.dropout(conv1,keep_rate_conv)

    #second layer does same as first layer 
    conv2 = conv2d(conv1,weights['W_conv2'])
    conv2 = add(conv2,biases['B_conv2'])
    conv2 = relu(conv2)
    conv2 = maxpool2d(conv2)
    conv2 = tf.nn.dropout(conv2,keep_rate_conv)

    #3rd layer fully connected
    fc = tf.reshape(conv2,[-1,7*7*64])
    fc = tf.matmul(fc,weights['W_fc'])
    fc = add(fc,biases['B_fc'])
    fc = relu(fc)
    fc = tf.nn.dropout(fc,keep_rate_fc)

    #4th and final layer
    output = tf.matmul(fc,weights['W_out'])
    output = add(output,biases['B_out'])

    return output

#Trains The neural Network
def train_neural_network(x):
    training_steps = 0
    #Initiate The Network
    prediction = convolutional_neural_network(x)

    #Define the Cost and Cost function
    #tf.reduce_mean averages the values of a tensor into one value
    cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y) )

    #Apply Regularization if active
    #if Regularization_active :
    #    print('DEBUG!! LINE 84 REGULARIZATION ACTIVE')
    #    cost = (tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits(prediction,y))+
    #        (Lambda*(tf.nn.l2_loss(weight['W_conv1'])+
    #        tf.nn.l2_loss(weight['W_conv2'])+
    #        tf.nn.l2_loss(weight['W_fc'])+
    #        tf.nn.l2_loss(weight['W_out'])+
    #        tf.nn.l2_loss(biases['B_conv1'])+
    #        tf.nn.l2_loss(biases['B_conv2'])+
    #        tf.nn.l2_loss(biases['B_fc'])+
    #        tf.nn.l2_loss(biases['B_out']))))

    #Optimizer + Learning_Rate passthrough
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    #Get Epoch Ammount 
    hm_epochs = epoch_amm

    #Starts C++ Training session
    print('Session Started')
    with tf.Session() as sess:
        #Initiate all Variables
        sess.run(tf.global_variables_initializer())

        #Begin Logs
        summary_writer = tf.summary.FileWriter('/tmp/logs',sess.graph)

        #Start Training
        for epoch in range(hm_epochs):

            epoch_loss = 0

            for count in range(int(mnist.train.num_examples/batch_size)):
                training_steps = (training_steps+1)
                epoch_x, epoch_y = mnist.train.next_batch(batch_size)
                count, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y})
                epoch_loss += c
                print('Epoch', epoch, 'current epoch loss', epoch_loss, 'batch loss', c,'ts',training_steps,'    ', end='\r')
            #Log the loss per epoch
            print('Epoch', epoch, 'completed out of',hm_epochs,'loss:',epoch_loss,'      ')



            acc_total = 0
            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
            for _ in range(int(mnist.test.num_examples/batch_size)):
                test_x, test_y = mnist.test.next_batch(batch_size)
                acc = accuracy.eval(feed_dict={x: test_x, y: test_y})
                acc_total += acc
                print('Accuracy:',acc_total*batch_size/float(mnist.test.num_examples),end='\r')
            print('Epoch', epoch, 'current test set accuracy : ',acc_total*batch_size/float(mnist.test.num_examples))

            acc_total=0
            correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
            accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
            for _ in range(int(mnist.train.num_examples/batch_size)):
                train_x, train_y = mnist.train.next_batch(batch_size)
                acc = accuracy.eval(feed_dict={x: train_x, y: train_y})
                acc_total += acc
                print('Accuracy:',acc_total*batch_size/float(mnist.train.num_examples),end='\r')
            print('Epoch', epoch, 'current train set accuracy : ',acc_total*batch_size/float(mnist.test.num_examples))

        print('Complete')

    sess.close()


#Run the Neural Network
train_neural_network(x)
The code is based on that shown in this YouTube video; the code shown in the video achieves a much higher accuracy than mine. youtube.com/… – Burak

1 Answer

2
votes

The CNN had low results for four reasons (sketches of the fixes follow this list):

  1. Improper (lack of) feeding of dropout: the keep rate was not being fed into accuracy.eval(feed_dict={x: test_x, y: test_y}), so dropout remained active during evaluation and the network underperformed in its accuracy evaluations.
  2. Poor initialization of weights
    • ReLU neurons work significantly better with weights initialized close to zero than with a standard normal distribution.
  3. Far too high a learning rate
    • A learning rate of .03, even with decay, was far too high and stopped the network from training effectively.
  4. Errors in the accuracy function
    • The accuracy calculation for the training data was taking the data-set size from mnist.test.num_examples instead of the proper mnist.train.num_examples, which produced nonsensical accuracy values in excess of 100%.
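
For points 1 and 4, a minimal sketch of a corrected evaluation loop is below. It assumes the dropout layers are wired to the keep_prob placeholder (the question's code passes the constant keep rates straight into tf.nn.dropout, so there would be nothing to feed at evaluation time); prediction, x, y, keep_prob, batch_size, and mnist are the same names used in the question.

    # Sketch: evaluate with dropout disabled and with the correct denominators.
    # Assumes every tf.nn.dropout call in the network uses the keep_prob placeholder.
    correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

    # Test-set accuracy: feed keep_prob=1.0 so no units are dropped during evaluation.
    acc_total = 0
    for _ in range(int(mnist.test.num_examples / batch_size)):
        test_x, test_y = mnist.test.next_batch(batch_size)
        acc_total += accuracy.eval(feed_dict={x: test_x, y: test_y, keep_prob: 1.0})
    print('test set accuracy :', acc_total * batch_size / float(mnist.test.num_examples))

    # Train-set accuracy: divide by mnist.train.num_examples, not mnist.test.num_examples.
    acc_total = 0
    for _ in range(int(mnist.train.num_examples / batch_size)):
        train_x, train_y = mnist.train.next_batch(batch_size)
        acc_total += accuracy.eval(feed_dict={x: train_x, y: train_y, keep_prob: 1.0})
    print('train set accuracy:', acc_total * batch_size / float(mnist.train.num_examples))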
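
For points 2 and 3, a sketch under assumed values is below: tf.truncated_normal with stddev=0.1 for the weights, small positive constants for the biases, and a fixed learning rate of 1e-4 for Adam. The exact numbers are illustrative rather than the ones from the fixed code; the variable names follow the dicts in the question.

    # Sketch: small-scale initialization plus an explicit, much lower learning rate.
    weights = {'W_conv1': tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1)),
               'W_conv2': tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1)),
               'W_fc':    tf.Variable(tf.truncated_normal([7 * 7 * 64, 1024], stddev=0.1)),
               'W_out':   tf.Variable(tf.truncated_normal([1024, n_classes], stddev=0.1))}

    # Small positive bias constants help keep ReLU units from starting out dead.
    biases = {'B_conv1': tf.Variable(tf.constant(0.1, shape=[32])),
              'B_conv2': tf.Variable(tf.constant(0.1, shape=[64])),
              'B_fc':    tf.Variable(tf.constant(0.1, shape=[1024])),
              'B_out':   tf.Variable(tf.constant(0.1, shape=[n_classes]))}

    # Adam with an explicit learning rate far below the original 0.03.
    optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)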