
I'm trying to use VGG16 from keras to train a model for image detection.

Based on these articles (https://www.pyimagesearch.com/2019/06/03/fine-tuning-with-keras-and-deep-learning/ and https://learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/), I've added some additional Dense layers on top of the VGG16 model. However, the training accuracy after 20 epochs is around 35% to 41%, which doesn't match the results in these articles (above 90%).

Because of this, I would like to know whether I did something wrong in my code below.

Basic setting

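# Imports assumed from the names used below
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import Input, Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

url='/content/drive/My Drive/fer2013.csv'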
batch_size = 64
img_width,img_height = 48,48

# 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral
num_classes = 7 
model_path = '/content/drive/My Drive/Af/cnn.h5'

df=pd.read_csv(url)  

def _load_fer():
    # Load training and eval data
    df = pd.read_csv(url, sep=',')
    train_df = df[df['Usage'] == 'Training']
    eval_df = df[df['Usage'] == 'PublicTest']
    return train_df, eval_df

def _preprocess_fer(df,label_col='emotion',feature_col='pixels'):
    labels, features = df.loc[:, label_col].values.astype(np.int32), [
        np.fromstring(image, np.float32, sep=' ')
        for image in df.loc[:, feature_col].values]
    
    labels = [to_categorical(l, num_classes=num_classes) for l in labels]

    features = np.stack((features,) * 3, axis=-1)
    features /= 255
    features = features.reshape(features.shape[0], img_width, img_height,3)
    return features, labels

# Load fer data
train_df, eval_df = _load_fer()

# preprocess fer data
x_train, y_train = _preprocess_fer(train_df)
x_valid, y_valid = _preprocess_fer(eval_df)

gen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

train_generator = gen.flow(x_train, y_train, batch_size=batch_size)
predict_size_train = int(np.ceil(len(x_train) / batch_size))

input_tensor = Input(shape=(img_width, img_height, 3)) 

Now comes the model training part

baseModel = VGG16(
    include_top=False, weights='imagenet',
    input_tensor=input_tensor
    )

# Construct the head of the model that will be placed on top of the base model (fine tuning)

headModel = baseModel.output
headModel = Flatten()(headModel)
headModel = Dense(1024, activation="relu")(headModel)
#headModel = Dropout(0.5)(headModel)
headModel = BatchNormalization()(headModel)
headModel = Dense(num_classes, activation="softmax")(headModel)

model = Model(inputs=baseModel.input, outputs=headModel)

for layer in baseModel.layers:
  layer.trainable = False

model summary

model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])

history = model.fit(train_generator,
                    steps_per_epoch=predict_size_train * 1,
                    epochs=20,
                    validation_data=valid_generator,
                    validation_steps=predict_size_valid)

Result: (image: result after training)

I would be very thankful for your advice. Best regards.

2
Can you add details about the dataset the model is trained on, such as size, batch_size, etc.? - Frightera
Have you tried lowering the learning rate to, say, 0.0001? Sometimes 0.001 is too high. - Dwight Foster
@Frightera I've just added some details about the data. - Kai-Chun Lin
@DwightFoster I'll try it later. I've also tried different optimizers from keras, which return quite different results. However, the accuracy is still below 45%. - Kai-Chun Lin

2 Answers

0
votes

Since you are freezing all the base layers, a single dense layer might not give you the desired accuracy. Also, if you are not in a hurry, you can leave out the validation_steps and steps_per_epoch parameters. Also, in this tutorial the model's metrics fluctuate, which you do not want.

I suggest:

for layer in baseModel.layers:
    layer.trainable = False

# The layer name may be different; check with baseModel.summary()
base_out = baseModel.get_layer('block3_pool').output

With that you can get a specific layer's output. Once you have the output, you can add some convolutions. After the convolutions, try stacking more dense layers, like:

x = tf.keras.layers.Flatten()(x)

x = Dense(512, activation= 'relu')(x)
x = Dropout(0.3)(x)
x = Dense(256, activation= 'relu')(x)
x = Dropout(0.2)(x)

output_model = Dense(num_classes, activation = 'softmax')(x)
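One possible reason to tap an earlier layer such as block3_pool (an assumption on my part, not something stated above) is that 48x48 inputs shrink quickly through VGG16's five 2x2 pooling stages. The sketch below only does the shape arithmetic in plain Python; the helper name vgg16_pool_shape is hypothetical, and the channel counts are the standard VGG16 block widths.

```python
# Output channels of VGG16's five convolutional blocks.
VGG16_BLOCK_CHANNELS = [64, 128, 256, 512, 512]

def vgg16_pool_shape(height, width, block):
    """Return (h, w, channels) at the output of block{block}_pool.

    Each block ends with a 2x2, stride-2 max pool, which halves the
    spatial size (rounding down); the convolutions keep the size.
    """
    for _ in range(block):
        height //= 2
        width //= 2
    return height, width, VGG16_BLOCK_CHANNELS[block - 1]

for block in (3, 5):
    h, w, c = vgg16_pool_shape(48, 48, block)
    print(f"block{block}_pool: {h}x{w}x{c} -> {h * w * c} features after Flatten")
```

With 48x48 images, the full base model leaves a single spatial position per channel before Flatten, while block3_pool still keeps a 6x6 grid for extra convolutions to work on.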

If you don't want to add convolutions and would rather use baseModel completely, that's also fine; in that case you can do something like this:

# 12 is arbitrary; you can try different values. Not all layers are frozen this time.
for layer in baseModel.layers[:12]:
    layer.trainable = False

# Check which layers are frozen
for i, layer in enumerate(baseModel.layers):
    print(i, layer.name, layer.trainable)
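To see which layers a cut-off of 12 would freeze without loading any weights, here is a minimal sketch. The helper freeze_up_to is hypothetical; it accepts any sequence of objects with name and trainable attributes, so it should also work on baseModel.layers. The stand-in objects below only mimic VGG16's usual block naming and are not real Keras layers.

```python
from types import SimpleNamespace

def freeze_up_to(layers, n_frozen):
    """Freeze the first n_frozen layers, leave the rest trainable,
    and return (index, name, trainable) for each layer."""
    for i, layer in enumerate(layers):
        layer.trainable = i >= n_frozen
    return [(i, layer.name, layer.trainable) for i, layer in enumerate(layers)]

# Stand-ins mirroring VGG16 without its top: an input layer plus
# five blocks of 2, 2, 3, 3, 3 convolutions, each followed by a pool.
names = ["input"]
for block, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
    names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    names.append(f"block{block}_pool")
fake_layers = [SimpleNamespace(name=n, trainable=True) for n in names]

report = freeze_up_to(fake_layers, 12)
for row in report:
    print(*row)
```

Under these assumptions, a cut-off of 12 freezes everything up to and including block4_conv1 and leaves the rest of block4 and all of block5 trainable.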

After that, you can try to set:

headModel = baseModel.output
headModel = Flatten()(headModel)
headModel = Dense(1024, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dense(num_classes, activation="softmax")(headModel)

If you see that your model is learning but your loss fluctuates, you can reduce the learning rate. Or you can use the ReduceLROnPlateau callback:

rd_lr = ReduceLROnPlateau(monitor='val_loss', factor = np.sqrt(0.1), patience= 4, verbose = 1, min_lr = 5e-8)

The parameters depend entirely on your model. For more details, see the docs.
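To get a feel for what factor and patience do, here is a simplified pure-Python simulation of a reduce-on-plateau schedule. It is only a sketch, not the Keras implementation: it ignores the cooldown and min_delta options, and the function name and the val_loss values are made up for illustration.

```python
import math

def simulate_reduce_lr_on_plateau(val_losses, initial_lr, factor, patience, min_lr):
    """Return the learning rate in effect during each epoch."""
    lr = initial_lr
    best = math.inf
    wait = 0
    lrs = []
    for loss in val_losses:
        lrs.append(lr)  # rate used while training this epoch
        if loss < best:
            # Improvement: remember it and reset the patience counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau: shrink the rate, but never below min_lr.
                lr = max(lr * factor, min_lr)
                wait = 0
    return lrs

# Hypothetical validation losses that stall after the second epoch.
losses = [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95]
lrs = simulate_reduce_lr_on_plateau(
    losses, initial_lr=1e-3, factor=math.sqrt(0.1), patience=4, min_lr=5e-8)
print(lrs)
```

With factor = sqrt(0.1), two consecutive reductions lower the learning rate by a factor of ten, which is gentler than jumping straight from 0.001 to 0.0001.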

0
votes

What is the form of the content of y_train? If the labels are integer values, then you need to convert them to one-hot vectors with

y_train = tf.keras.utils.to_categorical(y_train, num_classes)

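since you are using loss='categorical_crossentropy' in model.compile. In addition, VGG16's ImageNet weights expect inputs preprocessed with tf.keras.applications.vgg16.preprocess_input, which converts RGB to BGR and subtracts the ImageNet channel means from 0-255 pixel values (it does not scale them to between -1 and +1), so include it in the generator: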

gen = ImageDataGenerator(
        preprocessing_function=tf.keras.applications.vgg16.preprocess_input,
        # ... plus your other augmentation arguments
        )
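As a rough illustration of what that preprocessing does to the pixel values, here is a NumPy-only sketch. The function name vgg16_preprocess_sketch is hypothetical and this is an approximation for channels-last RGB input, not the library code; the constants are the ImageNet channel means in BGR order.

```python
import numpy as np

# ImageNet per-channel means in BGR order.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_sketch(rgb_batch):
    """Approximate VGG16 preprocessing for RGB input in [0, 255]."""
    bgr = rgb_batch[..., ::-1].astype(np.float32)  # RGB -> BGR (makes a copy)
    return bgr - IMAGENET_BGR_MEANS                # zero-center each channel

# A single pure-red pixel, shaped as a 1x1 image in a batch of one.
red = np.array([[[[255.0, 0.0, 0.0]]]], dtype=np.float32)
print(vgg16_preprocess_sketch(red))

# Pixels already divided by 255 end up nearly constant after centering.
scaled = np.random.rand(1, 48, 48, 3).astype(np.float32)
print(vgg16_preprocess_sketch(scaled).std())
```

The second print suggests why combining this with an earlier division by 255 is a problem: the inputs lose almost all of their contrast relative to what the pretrained filters expect.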

When you are training you have

for layer in baseModel.layers:
  layer.trainable = False

so you are only training the dense layers, which is OK but may not give you high accuracy. You might want to leave VGG trainable, but of course this will take longer. Or, after you train with VGG frozen, change it back to trainable and run a few more epochs to fine-tune the model.