
I'm trying to train a simple neural network that consists of:

  1. Convolution layer: 8 filters of size 5x5, stride 2.
  2. Max pooling 25x25 (the images have few details)
  3. Flattening the output (2x2x8) into a vector
  4. Classifier with logistic regression

Altogether the network has fewer than 1000 weights.
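
The layer sizes and the weight count above can be checked with a little arithmetic. This is only a sketch: it assumes 100x100x3 inputs and `padding='SAME'`, where each spatial dimension of the output is ceil(input / stride). The helper name `same_out` is made up for illustration.

```python
import math

def same_out(size, stride):
    # With padding='SAME', the output length is ceil(size / stride).
    return math.ceil(size / stride)

conv_out = same_out(100, 2)        # spatial size after the stride-2 convolution
pool_out = same_out(conv_out, 25)  # spatial size after 25x25 max pooling
flat_size = pool_out * pool_out * 8

conv_params = 5 * 5 * 3 * 8 + 8    # filter weights + biases
fc_params = flat_size * 2 + 2      # classifier weights + biases
total_params = conv_params + fc_params

print(conv_out, pool_out, flat_size, total_params)
```

Under these assumptions the flattened vector has 2*2*8 = 32 elements, which matches the shape used for the classifier weights.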

File: nn.py

#!/bin/python 
import tensorflow as tf
import create_batch

# Prepare data
batch = create_batch.batch

x = tf.reshape(batch[0], [-1,100,100,3])
y_ = batch[1]


# CONVOLUTION NETWORK

# For initialization  
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.3)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.2, shape=shape)
  return tf.Variable(initial)

# Convolution with stride 2
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 2, 2, 1], padding='SAME')

def max_pool_25x25(x):
  return tf.nn.max_pool(x, ksize=[1, 25, 25, 1],
                    strides=[1, 25, 25, 1], padding='SAME')

# First layer
W_conv1 = weight_variable([5, 5, 3, 8])
b_conv1 = bias_variable([8])

x_image = tf.reshape(x, [-1,100,100,3])

# First conv1
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_25x25(h_conv1)


# Dense connection layer
# make data flat
W_fc1 = weight_variable([2 * 2 * 8, 2])
b_fc1 = bias_variable([2])

h_pool1_flat = tf.reshape(h_pool1, [-1, 2*2*8])
y_conv = tf.nn.softmax(tf.matmul(h_pool1_flat, W_fc1) + b_fc1)

#Learning
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Session
sess = tf.Session()
sess.run(tf.initialize_all_variables())

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(200):
  if i%10 == 0:
    train_accuracy = accuracy.eval(session=sess)
    print("step %d, training accuracy %g"%(i, train_accuracy))

  train_step.run(session=sess)

File: create_batch.py

#!/bin/python
import tensorflow as tf

PATH1 = "../dane/trening/NK/"
PATH2 = "../dane/trening/K/"


def create_labeled_image_list():

    filenames = [(PATH1 + "nk_%d.png" % i) for i in range(300)]
    labels = [[1,0] for i in range(300)]

    filenames += [(PATH2 + "kulki_%d.png" % i) for i in range(300)]
    labels += [[0,1] for i in range(300)]

    return filenames, labels

def read_images_from_disk(input_queue):
    label = input_queue[1]
    file_contents = tf.read_file(input_queue[0])
    example = tf.image.decode_png(file_contents, channels=3)
    example.set_shape([100, 100, 3])
    example = tf.to_float(example)
    print ("READ, label:")
    print(label)
    return example, label

# Start
image_list, label_list = create_labeled_image_list()

# Create appropriate tensors for naming
images = tf.convert_to_tensor(image_list, dtype=tf.string)
labels = tf.convert_to_tensor(label_list, dtype=tf.float32)

input_queue = tf.train.slice_input_producer([images, labels],
                                        shuffle=True)

image, label = read_images_from_disk(input_queue)
batch = tf.train.batch([image, label], batch_size=600)

I'm feeding in 100x100 images. I have two classes with 300 images each. Basically, the randomly initialized network at step 0 has better accuracy than the trained one. The network stops learning once it reaches 0.5 accuracy (basically a coin flip). The images contain either a blue, blobby thing (class 1) or grass (class 2).

I'm training the network on the whole image set at once (600 images); the loss function is cross-entropy.

What am I doing wrong?
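
For reference, here is a rough plain-Python sketch of the two numbers a "coin flip" network produces on this balanced set: a cross-entropy of ln 2 (about 0.693) when every prediction is [0.5, 0.5], and an accuracy of 0.5 when one class is always predicted. The helper names are illustrative and are not part of the code above.

```python
import math

def cross_entropy(y_true, y_pred):
    # Mean over examples of -sum(y * log(p)); terms with y == 0 are skipped,
    # which also avoids evaluating log(0) for them.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        total += -sum(ti * math.log(pi) for ti, pi in zip(t, p) if ti > 0)
    return total / len(y_true)

def accuracy(y_true, y_pred):
    # Fraction of examples where the arg-max of the prediction matches the label.
    hits = sum(t.index(max(t)) == p.index(max(p)) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

labels = [[1, 0]] * 300 + [[0, 1]] * 300

baseline_loss = cross_entropy(labels, [[0.5, 0.5]] * 600)
one_class_accuracy = accuracy(labels, [[0.9, 0.1]] * 600)

print(baseline_loss, one_class_accuracy)
```

A loss that stays near ln 2, or an accuracy pinned at exactly 0.5, suggests the network is ignoring its input rather than learning slowly.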

Could you please post the whole code, including how you actually execute the training session? Also, the weight variables are initialized in an odd way (they are not tf.Variable). Honestly, I would suggest you check my code here, where I create convnets and train them: github.com/MaxKHK/Udacity_DeepLearningAssignments/blob/master/… - Maksim Khaitovich
@MaximHaytovich Whole-code posts are never encouraged on Stack and actually violate the posting terms. - csharpdude77
@david I meant 'post on pastebin and add a link here' or something like it. The original question really had too little info to work with. - Maksim Khaitovich
@MaximHaytovich Sorry, my mistake, buddy. - csharpdude77

1 Answer


OK, I've found a fix. There were two errors; now the network is learning.

  1. The images were RGBA, despite the fact that I declared them as RGB in TensorFlow.
  2. I did not normalize the images to float32 values in [-1, 1].

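In TensorFlow it should be done with something like this:

# I use "im" for the image
# Converts integer pixel data to float32 scaled to [0, 1]
im = tf.image.convert_image_dtype(im, dtype=tf.float32)
# Shift to [-0.5, 0.5], then scale to [-1, 1]
# (tf.sub and tf.mul were renamed tf.subtract and tf.multiply in TensorFlow 1.0)
im = tf.sub(im, 0.5)
im = tf.mul(im, 2.0)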
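
The intended mapping to [-1, 1] can be sketched in plain Python, independent of TensorFlow. It assumes 8-bit channel values in [0, 255]; the function names are hypothetical and only illustrate the arithmetic, including dropping the alpha channel of an RGBA pixel.

```python
def normalize_channel(value):
    """Map an 8-bit channel value in [0, 255] to a float in [-1.0, 1.0]."""
    scaled = value / 255.0        # first bring the value into [0, 1]
    return (scaled - 0.5) * 2.0   # then shift to [-0.5, 0.5] and stretch to [-1, 1]

def rgba_to_normalized_rgb(pixel):
    """Drop the alpha channel of an (R, G, B, A) pixel and normalize the rest."""
    r, g, b = pixel[:3]
    return [normalize_channel(c) for c in (r, g, b)]

print(normalize_channel(0), normalize_channel(255))
print(rgba_to_normalized_rgb((0, 255, 0, 255)))
```

Note that the shift has to subtract 0.5; adding it would move [0, 1] data to [1, 3] after scaling.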

To all newbies to ML - prepare your data with caution!
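
One possible way to catch the RGB/RGBA mismatch before training is to read the color type from each PNG header. This is a minimal sketch using only the standard library: it relies on the PNG layout (8-byte signature, then the IHDR chunk holding width, height, bit depth, and color type), and the function names `png_color_type` and `check_file` are made up for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Color type codes defined by the PNG format.
COLOR_TYPES = {0: "grayscale", 2: "RGB", 3: "palette",
               4: "grayscale+alpha", 6: "RGBA"}

def png_color_type(header):
    """Return the color type name given the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR data starts at byte 16: width, height (4 bytes each, big-endian),
    # then bit depth and color type (1 byte each).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return COLOR_TYPES.get(color_type, "unknown")

def check_file(path):
    """Read just enough of the file at `path` to report its color type."""
    with open(path, "rb") as f:
        return png_color_type(f.read(26))
```

Calling `check_file` on a few training images (for example, the first file in each class folder) would show whether they are stored as "RGB" or "RGBA".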

Thanks.