I am trying to retrain VGG16 on a new dataset using transfer learning. I loaded the model with ImageNet weights and without the top fully connected layers, obtained predictions on the dataset from the bottleneck layers, and trained a small model on those bottleneck predictions. However, the validation accuracy is very low: 0.002 after 50 epochs. I cannot figure out where the problem lies in my code, which is a modified version of the InceptionV3 retraining code from the Keras docs. I was able to retrain ResNet50 on the same dataset with an accuracy of 0.88. My code is below.
from keras.applications import VGG16
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, Flatten, Dropout
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Input
img_width, img_height = 224, 224
train_data_dir = 'Dataset/train'
validation_data_dir = 'Dataset/test'
nb_train_samples = 31119
nb_validation_samples = 13362
nb_epoch = 50
nb_classes = 281
batch_size = 16
input_tensor = Input(shape=(224, 224, 3))
base_model = VGG16(weights="imagenet", input_tensor=input_tensor, include_top=False)
x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(nb_classes, activation='sigmoid')(x)
model = Model(input=base_model.input, output=predictions)
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
train_datagen = ImageDataGenerator(
    rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=16,
    class_mode='categorical'
)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=16,
    class_mode='categorical'
)
history = model.fit_generator(
    train_generator,
    nb_epoch=nb_epoch,
    steps_per_epoch=nb_train_samples/batch_size,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples/batch_size)
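For reference, the two step counts passed to fit_generator above are plain divisions, which under Python 3 give non-integer values. The following is only a rough sketch of the arithmetic, assuming Python 3 and assuming one epoch is meant to cover every sample once (rounding up for the last partial batch). The names float_steps, train_steps and val_steps are illustrative and do not appear in the code above.

```python
import math

# Values copied from the code above.
nb_train_samples = 31119
nb_validation_samples = 13362
batch_size = 16

# Plain division, as written in the fit_generator call (a float in Python 3).
float_steps = nb_train_samples / batch_size

# Whole-number step counts that cover every sample, including the
# final partial batch (illustrative names, not from the original code).
train_steps = math.ceil(nb_train_samples / batch_size)
val_steps = math.ceil(nb_validation_samples / batch_size)

print(float_steps, train_steps, val_steps)
```

Whether the fractional value matters depends on the Keras version in use, so this is a sanity check on the numbers rather than a diagnosis of the low accuracy.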
softmax activation at the end when using categorical_crossentropy. Any specific reason to use a sigmoid? Also, just a couple of comments: you don't need to set steps_per_epoch and validation_steps if you want all samples to be used; target_size should always be a (height, width) tuple. You reversed its members. - ldavid
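To illustrate the point about the output activation, here is a minimal sketch in plain Python with no Keras involved. It assumes three made-up logits and a made-up one-hot target. The helpers sigmoid, softmax and categorical_crossentropy are simplified stand-ins written for this example, not the Keras implementations. It shows that per-class sigmoid outputs are independent and need not sum to 1, whereas softmax outputs form a probability distribution over the classes, which is what a categorical cross-entropy loss over one-hot labels expects.

```python
import math

def sigmoid(logits):
    # Element-wise sigmoid: each class score is squashed independently.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    # Softmax: scores are normalised jointly so they sum to 1.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    # Simplified stand-in: -sum(t * log(p)) over the classes.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

# Hypothetical values, chosen only for illustration.
logits = [2.0, 1.0, -1.0]
target = [1, 0, 0]

sig = sigmoid(logits)
soft = softmax(logits)

print("sigmoid outputs sum to", sum(sig))
print("softmax outputs sum to", sum(soft))
print("cross-entropy on softmax outputs:", categorical_crossentropy(target, soft))
```

In the model from the question, the corresponding change would be the activation argument of the final Dense layer. Whether that alone explains the low accuracy is not something this sketch can show.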