
I am trying to retrain VGG16 on a new dataset using transfer learning. I loaded the model with ImageNet weights and without the top fully connected layers, obtained predictions on the dataset from the bottleneck layers, and trained a small model on those bottleneck predictions. However, the validation accuracy is very low: 0.002 after 50 epochs. I cannot figure out where the problem lies in my code, which is a modified version of the InceptionV3 retraining example from the Keras docs. I was able to retrain ResNet50 on the same dataset with an accuracy of 0.88. My code is below; a rough sketch of the bottleneck-feature step follows it.

from keras.applications import VGG16
from keras.models import Model
from keras.layers import Input, Dense, Flatten, Dropout
from keras.preprocessing.image import ImageDataGenerator

img_width, img_height = 224, 224

train_data_dir = 'Dataset/train'
validation_data_dir = 'Dataset/test'

nb_train_samples = 31119
nb_validation_samples = 13362
nb_epoch = 50
nb_classes = 281
batch_size = 16

input_tensor = Input(shape=(224, 224, 3))
base_model = VGG16(weights="imagenet", input_tensor=input_tensor, include_top=False)

x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(nb_classes, activation='sigmoid')(x)

model = Model(input=base_model.input, output=predictions)

for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=16,
    class_mode='categorical'
)

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=16,
    class_mode='categorical'
)

history = model.fit_generator(
    train_generator,
    epochs=nb_epoch,
    steps_per_epoch=nb_train_samples // batch_size,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
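
The bottleneck-feature extraction mentioned above looks roughly like this (a sketch rather than the exact code; the .npy file name and the non-shuffled, label-free generator are assumptions):

import numpy as np

# Run the frozen convolutional base once over the training data and save
# its outputs; the small top model is then trained on these features.
bottleneck_datagen = ImageDataGenerator(rescale=1./255)
bottleneck_generator = bottleneck_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode=None,  # images only; labels are recovered from file order
    shuffle=False)    # keep order so features stay aligned with labels

bottleneck_features = base_model.predict_generator(
    bottleneck_generator, nb_train_samples // batch_size)
np.save('bottleneck_features_train.npy', bottleneck_features)  # assumed file name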
We usually go for the softmax activation at the end when using categorical_crossentropy. Is there a specific reason to use a sigmoid? Also, a couple of comments: you don't need to set steps_per_epoch and validation_steps if you want all samples to be used, and target_size should always be a (height, width) tuple; you reversed its members. - ldavid

1 Answer


VGG16 uses the Sequential model from Keras, while ResNet uses the functional API. Therefore, you should replace

x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(nb_classes, activation='sigmoid')(x)

model = Model(input=base_model.input, output=predictions)

by

from keras.models import Sequential

model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes, activation='softmax'))

Also, use softmax instead of sigmoid when training with class_mode='categorical' and categorical_crossentropy.
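
For completeness, the freezing loop and the compile step from the question carry over unchanged to the Sequential version; a minimal sketch:

# Freeze the convolutional base and compile, exactly as in the question.
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])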