
I am trying to do transfer learning in Keras + TensorFlow on a subset of the Places-205 dataset containing only 27 categories. I am using InceptionV3, DenseNet121 and ResNet50, pre-trained on ImageNet, and I add a couple of extra layers to adapt them to my classes. If the model is ResNet50, I add Flatten + Dense for classification; if it is DenseNet121 or InceptionV3, I add Global Average Pooling + Dense (ReLU) + Dense (classification). This is the code snippet:

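# Assumes the tf.keras API; base_model, FLAGS and classes are defined elsewhere.
from tensorflow.keras.layers import Dense, Flatten, GlobalAveragePooling2D

x = base_model.output
# Note: `in` on a string is a substring test, so 'resnet' would also match here.
if FLAGS.model in 'resnet50':
    x = Flatten(name="flatten")(x)
else:
    x = GlobalAveragePooling2D()(x)
    # Fully connected layer
    x = Dense(1024, activation='relu')(x)
# Softmax classification layer
predictions = Dense(classes, activation='softmax')(x)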

For DenseNet and InceptionV3 the training is fine and the validation accuracy reaches 70%, but for ResNet the validation accuracy stays fixed at 0.0369/0.037 (which is 1/27, one over my number of classes). It seems to always predict a single class, which is odd because the training itself progresses normally and the model-independent code is exactly the same as for DenseNet and InceptionV3, which work as expected.
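As a quick sanity check on the 1/27 figure, here is a minimal sketch showing that a classifier which always predicts the same class scores exactly 1/27 on a balanced 27-class set. The sample count per class and the choice of predicted class are made up for illustration, and the real validation set may not be perfectly balanced:

```python
from collections import Counter

num_classes = 27
samples_per_class = 100  # hypothetical; assumes a perfectly balanced validation set

# Balanced ground-truth labels: 100 samples for each of the 27 classes.
y_true = [c for c in range(num_classes) for _ in range(samples_per_class)]
# A degenerate classifier that always outputs the same (arbitrary) class.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
predicted_classes = Counter(y_pred)

print(accuracy)                 # equals 1/27, roughly 0.037
print(len(predicted_classes))   # number of distinct predicted classes
```

Counting the distinct predicted classes on the real validation predictions is a cheap way to confirm the collapse to one class.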

Do you have any idea why it happens?

Thanks a lot!

Are you sure you're hitting the Flatten line? You're using "in" instead of "==" in your comparison. You should be using "in" with a list or something similar. - Daniel Möller
Yes, I printed the model summary and it has the Flatten layer. - Ciprian Andrei Focsaneanu
OK... could the extra Dense layer be more important than expected? - Daniel Möller
I suspect you may have a problem with a high learning rate for that model. It rushes to some place and can't get the details. - Daniel Möller
I also thought of this (the high learning rate) and changed it, but the situation didn't change much (the validation accuracy grows to a maximum of 0.037). The confusion matrix indicates that the 21st category (of 27) is always predicted. - Ciprian Andrei Focsaneanu
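To illustrate the first comment's point about "in" versus "==", here is a small plain-Python sketch. The flag value 'resnet' is hypothetical and only chosen to show where the two checks differ; with the exact value 'resnet50' all three checks agree, which is consistent with the Flatten layer showing up in the model summary:

```python
model_name = 'resnet'  # hypothetical flag value, not from the original post

# `in` with a string on the right-hand side is a substring test.
substring_match = model_name in 'resnet50'
# `==` requires the two strings to be identical.
exact_match = model_name == 'resnet50'
# `in` with a list on the right-hand side tests membership, element by element.
list_match = model_name in ['resnet50']

print(substring_match)  # True
print(exact_match)      # False
print(list_match)       # False
```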

1 Answer


I had a similar issue, @Ciprian Andrei Focsaneanu. What worked for me was making the earlier layers (those before the fully connected layers) trainable, as the pre-trained filters/features of ResNet50 were not suitable for my application.

Strangely enough, I also trained a VGG16 model, which was pre-trained on the same images (ImageNet), and its filters did work for my application, but I digress.
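A minimal sketch of that idea, assuming the Keras convention that a model exposes a layers list and each layer has a boolean trainable attribute. The helper name set_trainable_from and its first_trainable parameter are hypothetical and not taken from the posts above:

```python
def set_trainable_from(model, first_trainable=0):
    """Freeze layers before index `first_trainable` and unfreeze the rest.

    `model` is assumed to expose a `.layers` list whose items have a boolean
    `.trainable` attribute, as Keras models do. With the default of 0,
    every layer becomes trainable.
    """
    for index, layer in enumerate(model.layers):
        layer.trainable = index >= first_trainable


# Hypothetical usage with the question's `base_model` (not run here):
#   set_trainable_from(base_model)   # unfreeze the whole ResNet50 base
#   model.compile(...)               # re-compile so the change takes effect
```

In Keras, changes to trainable only take effect after the model is compiled again. When fine-tuning pre-trained layers it is usually worth trying a lower learning rate than the one used for the new head alone.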

Here's the link to a page that inspired me to do this: https://datascience.stackexchange.com/questions/16840/multi-class-neural-net-always-predicting-1-class-after-optimization

Hope this helps!