3 votes

The accuracy of my convolutional neural network decreases instead of increasing when I use a large number of training samples (100,000). With a smaller number of training samples (6,000), the accuracy increases up to a point and then starts decreasing.

Example:

nr_training_examples 100000
tb 2500
epoch 0  loss 0.19646 acc 18.52
nr_test_examples 5000
Accuracy test set 0.00
nr_training_examples 100000
tb 2500
epoch 1  loss 0.20000 acc 0.00
nr_test_examples 5000
Accuracy test set 0.00
nr_training_examples 100000
tb 2500

What am I doing wrong?

I'm using photos of faces as training samples (70 x 70 pixels).

The network is inspired by the VGG model:

2 x conv-3
max_pooling
2 x conv-3
max_pooling
2 x conv-3
1 x conv-1
max_pooling
2 x conv-3
1 x conv-1
max_pooling
fully_connected 1024
fully_connected 1024 - output 128

And here's the model:

def siamese_convnet(x):
    global keep_rate

    # Convolution filter weights: [filter_height, filter_width, in_channels, out_channels].
    # Note that tf.random_normal draws from N(0, 1), i.e. stddev defaults to 1.0.
    w_conv1_1 = tf.get_variable(name='w_conv1_1', initializer=tf.random_normal([3, 3, 1, 64]))
    w_conv1_2 = tf.get_variable(name='w_conv1_2', initializer=tf.random_normal([3, 3, 64, 64]))

    w_conv2_1 = tf.get_variable(name='w_conv2_1', initializer=tf.random_normal([3, 3, 64, 128]))
    w_conv2_2 = tf.get_variable(name='w_conv2_2', initializer=tf.random_normal([3, 3, 128, 128]))

    w_conv3_1 = tf.get_variable(name='w_conv3_1', initializer=tf.random_normal([3, 3, 128, 256]))
    w_conv3_2 = tf.get_variable(name='w_conv3_2', initializer=tf.random_normal([3, 3, 256, 256]))
    w_conv3_3 = tf.get_variable(name='w_conv3_3', initializer=tf.random_normal([1, 1, 256, 256]))

    w_conv4_1 = tf.get_variable(name='w_conv4_1', initializer=tf.random_normal([3, 3, 256, 512]))
    w_conv4_2 = tf.get_variable(name='w_conv4_2', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv4_3 = tf.get_variable(name='w_conv4_3', initializer=tf.random_normal([1, 1, 512, 512]))

    w_conv5_1 = tf.get_variable(name='w_conv5_1', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv5_2 = tf.get_variable(name='w_conv5_2', initializer=tf.random_normal([3, 3, 512, 512]))
    w_conv5_3 = tf.get_variable(name='w_conv5_3', initializer=tf.random_normal([1, 1, 512, 512]))

    # Fully connected layer weights; the first expects a flattened 2 x 2 x 512 feature map.
    w_fc_1 = tf.get_variable(name='fc_1', initializer=tf.random_normal([2*2*512, 1024]))
    w_fc_2 = tf.get_variable(name='fc_2', initializer=tf.random_normal([1024, 1024]))

    fc_layer = tf.get_variable(name='fc_layer', initializer=tf.random_normal([1024, 1024]))
    w_out = tf.get_variable(name='w_out', initializer=tf.random_normal([1024, 128]))

    # Biases, one per output channel (conv) or unit (fully connected).
    bias_conv1_1 = tf.get_variable(name='bias_conv1_1', initializer=tf.random_normal([64]))
    bias_conv1_2 = tf.get_variable(name='bias_conv1_2', initializer=tf.random_normal([64]))

    bias_conv2_1 = tf.get_variable(name='bias_conv2_1', initializer=tf.random_normal([128]))
    bias_conv2_2 = tf.get_variable(name='bias_conv2_2', initializer=tf.random_normal([128]))

    bias_conv3_1 = tf.get_variable(name='bias_conv3_1', initializer=tf.random_normal([256]))
    bias_conv3_2 = tf.get_variable(name='bias_conv3_2', initializer=tf.random_normal([256]))
    bias_conv3_3 = tf.get_variable(name='bias_conv3_3', initializer=tf.random_normal([256]))

    bias_conv4_1 = tf.get_variable(name='bias_conv4_1', initializer=tf.random_normal([512]))
    bias_conv4_2 = tf.get_variable(name='bias_conv4_2', initializer=tf.random_normal([512]))
    bias_conv4_3 = tf.get_variable(name='bias_conv4_3', initializer=tf.random_normal([512]))

    bias_conv5_1 = tf.get_variable(name='bias_conv5_1', initializer=tf.random_normal([512]))
    bias_conv5_2 = tf.get_variable(name='bias_conv5_2', initializer=tf.random_normal([512]))
    bias_conv5_3 = tf.get_variable(name='bias_conv5_3', initializer=tf.random_normal([512]))

    bias_fc_1 = tf.get_variable(name='bias_fc_1', initializer=tf.random_normal([1024]))
    bias_fc_2 = tf.get_variable(name='bias_fc_2', initializer=tf.random_normal([1024]))

    bias_fc = tf.get_variable(name='bias_fc', initializer=tf.random_normal([1024]))
    out = tf.get_variable(name='out', initializer=tf.random_normal([128]))

    # Reshape the flat input into NHWC format: [batch, 70, 70, 1].
    x = tf.reshape(x, [-1, 70, 70, 1])

    # Block 1: 2 x conv-3, then pooling.
    conv1_1 = tf.nn.relu(conv2d(x, w_conv1_1) + bias_conv1_1)
    conv1_2 = tf.nn.relu(conv2d(conv1_1, w_conv1_2) + bias_conv1_2)

    max_pool1 = max_pool(conv1_2)

    # Block 2: 2 x conv-3, then pooling.
    conv2_1 = tf.nn.relu(conv2d(max_pool1, w_conv2_1) + bias_conv2_1)
    conv2_2 = tf.nn.relu(conv2d(conv2_1, w_conv2_2) + bias_conv2_2)

    max_pool2 = max_pool(conv2_2)

    # Block 3: 2 x conv-3 + 1 x conv-1, then pooling.
    conv3_1 = tf.nn.relu(conv2d(max_pool2, w_conv3_1) + bias_conv3_1)
    conv3_2 = tf.nn.relu(conv2d(conv3_1, w_conv3_2) + bias_conv3_2)
    conv3_3 = tf.nn.relu(conv2d(conv3_2, w_conv3_3) + bias_conv3_3)

    max_pool3 = max_pool(conv3_3)

    # Block 4: 2 x conv-3 + 1 x conv-1, then pooling.
    conv4_1 = tf.nn.relu(conv2d(max_pool3, w_conv4_1) + bias_conv4_1)
    conv4_2 = tf.nn.relu(conv2d(conv4_1, w_conv4_2) + bias_conv4_2)
    conv4_3 = tf.nn.relu(conv2d(conv4_2, w_conv4_3) + bias_conv4_3)

    max_pool4 = max_pool(conv4_3)

    # Block 5: 2 x conv-3 + 1 x conv-1, then pooling.
    conv5_1 = tf.nn.relu(conv2d(max_pool4, w_conv5_1) + bias_conv5_1)
    conv5_2 = tf.nn.relu(conv2d(conv5_1, w_conv5_2) + bias_conv5_2)
    conv5_3 = tf.nn.relu(conv2d(conv5_2, w_conv5_3) + bias_conv5_3)

    max_pool5 = max_pool(conv5_3)

    # Flatten the last pooling layer (max_pool5) into the 2*2*512 vector w_fc_1 expects.
    fc_helper = tf.reshape(max_pool5, [-1, 2*2*512])
    fc_1 = tf.nn.relu(tf.matmul(fc_helper, w_fc_1) + bias_fc_1)

    fc = tf.nn.relu(tf.matmul(fc_1, fc_layer) + bias_fc)

    output = tf.matmul(fc, w_out) + out

    # L2-normalize each embedding vector along the feature axis.
    output = tf.nn.l2_normalize(output, 1)

    return output
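
The conv2d and max_pool helpers are not shown. A minimal version consistent with the 2*2*512 flatten (assuming stride-1 SAME-padded convolutions and 2x2, stride-2, VALID-padded pooling, so the spatial size goes 70 -> 35 -> 17 -> 8 -> 4 -> 2) would be:

import tensorflow as tf

def conv2d(x, w):
    # Stride-1 convolution with SAME padding preserves the spatial size.
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(x):
    # 2x2 pooling with stride 2 and VALID padding floors the spatial size:
    # 70 -> 35 -> 17 -> 8 -> 4 -> 2, matching the 2*2*512 flatten above.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')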

2 Answers

3 votes

the accuracy increases up to a point and then starts decreasing.

This is a sign that your network is overfitting. If you're still in doubt, check your cost function on a held-out set: if it starts increasing at some point while the training cost keeps falling, the network is overfitting.

There are several common ways to reduce overfitting:

  • Increase the amount of training data
  • Add dropout to your network (randomly turn off neurons during training)
  • Add regularization (weight decay); see the sketch below

You can find details about these solutions here:

http://neuralnetworksanddeeplearning.com/chap3.html#overfitting_and_regularization
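
For example, weight decay in a TF1-style graph could look like the sketch below; base_loss stands for whatever loss you are already minimizing, and the lambda value is just an assumed starting point:

# Penalize the L2 norm of all non-bias weights and add it to the existing loss.
l2_lambda = 5e-4
l2_loss = tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables()
                    if 'bias' not in v.name])
total_loss = base_loss + l2_lambda * l2_loss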

1 vote

Your network might be overfitting. Try adding dropout (with a keep probability of around 0.5) to your fully connected layers.
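
For example, reusing the keep_rate global already declared in the model, the fully connected part could become something like this (a sketch; the placement and rate are suggestions):

# Dropout after each fully connected layer; keep_rate should be ~0.5 while
# training and 1.0 when evaluating (e.g. fed through a placeholder).
fc_1 = tf.nn.relu(tf.matmul(fc_helper, w_fc_1) + bias_fc_1)
fc_1 = tf.nn.dropout(fc_1, keep_prob=keep_rate)

fc = tf.nn.relu(tf.matmul(fc_1, fc_layer) + bias_fc)
fc = tf.nn.dropout(fc, keep_prob=keep_rate)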