2
votes

I have a small dataset of ~150 images. Each image contains an object (a rectangular box with a black-and-white pattern) placed on the floor. The object is the same in all images, but the floor pattern differs. The objective is to train a network to find the center of the object. Each image has dimensions 256x256x3.

Train_X is of size 150x256x256x3 and Train_y is of size 150x2 (150 here is the total number of images).

I understand that 150 images is a very small dataset, but I am OK with giving up some accuracy, so I trained a ConvNet on the data. Here is the architecture I used:

  • Conv2D layer (32 filters)
  • ReLU activation
  • Conv2D layer (64 filters)
  • ReLU activation
  • Flatten layer
  • Dense(64) layer
  • ReLU activation
  • Dense(2)
  • Softmax activation
  • model.compile(loss='mse', optimizer='sgd')

Observation: The trained model always returns the normalized center of the image, (0.5, 0.5), as the center of the object, even on the training data. I was hoping to get the center of the rectangular object rather than the center of the image when I run predict on Train_X. Am I getting this output because of my choice of conv layers?
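
One way to check whether a network has learned anything at all is to compare its training error with that of a constant predictor. Under MSE, the best constant prediction is the per-coordinate mean of the targets, so a collapsed network tends to end up near that value. A minimal NumPy sketch, in which `train_y` and `pred` are made-up stand-ins for the real Train_y and the output of predict:

```python
import numpy as np

# Hypothetical stand-in for Train_y: 150 normalized (x, y) object centers.
rng = np.random.default_rng(0)
train_y = rng.uniform(0.2, 0.8, size=(150, 2))


def mse(y_true, y_pred):
    """Mean squared error averaged over samples and coordinates."""
    return float(np.mean((y_true - y_pred) ** 2))


# Best constant prediction under MSE: the mean of the targets.
mean_center = train_y.mean(axis=0)
baseline_mse = mse(train_y, mean_center)

# A collapsed model that always answers (0.5, 0.5), as described above.
pred = np.full_like(train_y, 0.5)
collapsed_mse = mse(train_y, pred)

print("baseline (mean) MSE:  ", baseline_mse)
print("always-(0.5, 0.5) MSE:", collapsed_mse)
```

If the trained model's MSE is not clearly below the baseline value, it is effectively predicting a constant.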

3
Try switching the activation to sigmoid. When you use softmax you add a spurious constraint to your output, namely that the coordinates sum to 1. - Marcin Możejko
I tried softmax as well but the result is the same. I am not sure why all the predicted values on the train and test sets give the normalized center of the image as the center of the object. - visionStudent
softmax or sigmoid? - Marcin Możejko
I meant to say that I tried sigmoid as well. I am still getting the normalized center as the predicted output. I also tried MSE and ASE as loss functions, and I still get the same problem. - visionStudent
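
The softmax constraint raised in the first comment can be checked numerically. The sketch below uses only the standard library; the helper names `softmax`, `sigmoid` and `logit` are written here for illustration and are not Keras functions. It shows that softmax outputs always sum to 1, that equal logits give exactly (0.5, 0.5), and that independent sigmoids can represent a target such as (0.8, 0.7).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function, applied to one coordinate at a time."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of the sigmoid for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Whatever the logits are, the two softmax outputs sum to 1,
# so a target such as (0.8, 0.7) can never be produced.
for logits in [(0.0, 0.0), (2.0, -1.0), (5.0, 4.0)]:
    out = softmax(list(logits))
    print(logits, "->", out, "sum =", sum(out))

# Equal logits give exactly (0.5, 0.5).
equal = softmax([0.0, 0.0])

# Sigmoid treats each coordinate independently, so (0.8, 0.7) is reachable.
target = (0.8, 0.7)
recovered = [sigmoid(logit(p)) for p in target]
print("sigmoid outputs:", recovered)
```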

3 Answers

0
votes

Since you haven't mentioned them in the details, the following suggestions could help (if you haven't implemented them already):

1) Normalize the input data (e.g., for input images, x_train = x_train/255 before feeding the input to the network)

2) Try a linear activation for the last output layer

3) Train for more epochs, and experiment with different batch sizes
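
These three suggestions can be combined in a short Keras sketch. It assumes TensorFlow 2.x is installed; the function name `build_center_regressor`, the layer sizes, the choice of Adam, and the random stand-in data are illustrative assumptions, not the asker's actual setup.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_center_regressor(input_shape=(256, 256, 3)):
    """Small ConvNet that regresses a normalized (x, y) center."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, strides=2, activation="relu"),
        layers.Conv2D(64, 3, strides=2, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        # Linear output: no sum-to-one constraint on the two coordinates.
        layers.Dense(2, activation="linear"),
    ])
    model.compile(loss="mse", optimizer="adam")
    return model


# Hypothetical stand-in data: 8 random "images" and random targets.
rng = np.random.default_rng(0)
raw_images = rng.integers(0, 256, size=(8, 256, 256, 3))
train_x = raw_images.astype("float32") / 255.0  # scale pixels to [0, 1]
train_y = rng.uniform(0.0, 1.0, size=(8, 2)).astype("float32")

model = build_center_regressor()
history = model.fit(train_x, train_y, epochs=2, batch_size=4, verbose=0)
pred = model.predict(train_x, verbose=0)
print(pred.shape)
```

On real data the number of epochs and the batch size would be much larger than in this toy run and are worth tuning, as suggested in point 3.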

0
votes

You are essentially trying to solve a regression problem. Apart from what you have already done, there are a few other things you can try:

  1. Use image augmentation to generate more data. Also, normalize the images.
  2. Make the model deeper with a few more convolution layers.
  3. Use a suitable weight initializer, for example He normal, for the convolution layers.
  4. Use BatchNormalization between layers so that each layer's activations are normalized to roughly zero mean and unit standard deviation over a batch (before the learned scale and shift).
  5. Try a cross-entropy loss (for coordinates scaled to [0, 1], binary cross-entropy on sigmoid outputs), as it can give better-behaved gradients. With MSE on a saturating output the gradients can become very small over time, although MSE is usually the preferred loss for regression problems.
  6. Try changing the optimizer to Adam.
  7. If you have several classes in your dataset and a class imbalance problem, you can use focal loss, a variant of cross-entropy loss that down-weights well-classified examples so that misclassified ones dominate the loss. Reducing the batch size and upsampling should also help.
  8. Use Bayesian optimization techniques for hyperparameter tuning of your model.
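
The gradient argument in point 5 can be illustrated for a single sigmoid output p = sigmoid(z) with target y. For MSE, dL/dz = 2(p - y)p(1 - p), which vanishes when the sigmoid saturates; for binary cross-entropy, dL/dz = p - y. The sketch below is standard-library Python with helper names chosen for illustration, and the target value 0.2 is arbitrary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_grad_wrt_logit(z, y):
    """d/dz of (sigmoid(z) - y)**2."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)


def bce_grad_wrt_logit(z, y):
    """d/dz of -(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(z)."""
    return sigmoid(z) - y


y = 0.2  # arbitrary normalized target coordinate
for z in (0.0, 3.0, 6.0, 10.0):
    print(f"z={z:5.1f}  p={sigmoid(z):.5f}  "
          f"dMSE/dz={mse_grad_wrt_logit(z, y):.6f}  "
          f"dBCE/dz={bce_grad_wrt_logit(z, y):.6f}")
```

As z grows, the prediction moves further from the target, yet the MSE gradient shrinks toward zero while the cross-entropy gradient stays close to 1 - y.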

A sample model code (an MNIST classification example; for the regression task above, the final layer and the loss would need to be changed):

import os
import pickle

from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense,
                                     Dropout, Flatten, MaxPooling2D)
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

DATA_DIR = '.'  # placeholder: directory that contains mnist.pickle

def load_data():
    # Load the pickled MNIST train/validation splits
    with open(os.path.join(DATA_DIR, 'mnist.pickle'), 'rb') as fr:
        X_train, Y_train, X_val, Y_val = pickle.load(fr)
    # Conv2D expects (height, width, channels), so keep the 28x28 layout
    # instead of flattening each image to 784 values
    X_train = X_train.reshape(-1, 28, 28, 1).astype('float32')
    X_val = X_val.reshape(-1, 28, 28, 1).astype('float32')
    # Scale pixel values to [0, 1]
    X_train /= 255
    X_val /= 255
    # One-hot encode the digit labels
    nb_classes = 10
    Y_train = to_categorical(Y_train, nb_classes)
    Y_val = to_categorical(Y_val, nb_classes)
    return X_train, Y_train, X_val, Y_val

def build_model(input_shape, dropout=True, classes=10):
    model = Sequential()
    model.add(Conv2D(32, (5,5), activation='relu', kernel_initializer='he_uniform', padding='valid', input_shape=input_shape))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2,2), strides=1, padding='valid'))
    if dropout:
        model.add(Dropout(0.2))
    model.add(Conv2D(64, (3,3), activation='relu', kernel_initializer='he_uniform', padding='valid'))
    model.add(Conv2D(128, (3,3), activation='relu', kernel_initializer='he_uniform', padding='valid'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2,2), strides=2, padding='valid'))
    if dropout:
        model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(BatchNormalization())
    # Classification head: one softmax probability per class
    model.add(Dense(classes, activation='softmax', kernel_initializer='he_uniform'))
    # Alternative optimizer: SGD(learning_rate=0.01, momentum=0.9)
    optimizer = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

0
votes

I think using the softmax activation in the last layer is the main reason your network performs poorly, so you can use a linear activation (or none at all), or ReLU, instead. I also suggest using the intermediate output of a pretrained network such as VGG, so that you don't need to train the convolutional part and only train the dense part. Since you have so little data, you can use Keras image generators to create more images by augmentation, as below.

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

num_classes = 10  # CIFAR-10 has 10 classes
epochs = 5        # placeholder value
# `model` is assumed to be a compiled Keras model defined elsewhere

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)
# fits the model on batches with real-time data augmentation
# (steps_per_epoch must be an integer, hence the floor division):
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          steps_per_epoch=len(x_train) // 32, epochs=epochs)
# here's a more "manual" example
for e in range(epochs):
    print('Epoch', e)
    batches = 0
    for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
        model.fit(x_batch, y_batch)
        batches += 1
        if batches >= len(x_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
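
One caveat for this particular task: flips and shifts move the object, and datagen.flow(x, y) transforms only the images and passes y through unchanged, so for a center-regression target the labels have to be transformed along with the pixels. The NumPy sketch below shows the idea for a horizontal flip and a pixel shift. The helper names and the toy image are assumptions made for illustration, and np.roll wraps pixels around the border, so it is only a reasonable stand-in for a shift while the object stays fully inside the frame.

```python
import numpy as np


def flip_horizontal(image, center):
    """Mirror an HxWxC image left-right and update the normalized (x, y) center."""
    x, y = center
    return image[:, ::-1, :], (1.0 - x, y)


def shift(image, center, dx, dy):
    """Cyclically shift by (dx, dy) pixels and move the center accordingly."""
    h, w = image.shape[:2]
    x, y = center
    shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
    return shifted, (x + dx / w, y + dy / h)


def box_center(image):
    """Normalized centroid of the non-zero pixels (used here only for checking)."""
    rows, cols = np.nonzero(image[..., 0])
    h, w = image.shape[:2]
    return ((cols.mean() + 0.5) / w, (rows.mean() + 0.5) / h)


# Toy image: a white rectangle on a black 256x256 floor.
img = np.zeros((256, 256, 3), dtype=np.float32)
img[100:140, 30:90, :] = 1.0
center = box_center(img)

flipped_img, flipped_center = flip_horizontal(img, center)
shifted_img, shifted_center = shift(img, center, dx=20, dy=-10)

print("original:", center)
print("flipped: ", flipped_center, "measured:", box_center(flipped_img))
print("shifted: ", shifted_center, "measured:", box_center(shifted_img))
```

The measured centers of the transformed images match the transformed labels, which is the property the augmented training pairs need to have.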

So, in summary:

  • Remove the softmax activation, or replace it with a linear activation (or ReLU/LeakyReLU).
  • Use a pretrained network for feature extraction.
  • Use image augmentation to create more images.