
I have just started with computer vision, and in my current task I am classifying images into 4 categories.

Total number of image files: 1043

I am using pretrained InceptionV3 and fine tuning it on my dataset.

This is what I get after each epoch:

Epoch 1/5
320/320 [==============================] - 1925s 6s/step - loss: 0.4318 - acc: 0.8526 - val_loss: 1.1202 - val_acc: 0.5557

Epoch 2/5
320/320 [==============================] - 1650s 5s/step - loss: 0.1807 - acc: 0.9446 - val_loss: 1.2694 - val_acc: 0.5436

Epoch 3/5
320/320 [==============================] - 1603s 5s/step - loss: 0.1236 - acc: 0.9572 - val_loss: 1.2597 - val_acc: 0.5546

Epoch 4/5
320/320 [==============================] - 1582s 5s/step - loss: 0.1057 - acc: 0.9671 - val_loss: 1.3845 - val_acc: 0.5457

Epoch 5/5
320/320 [==============================] - 1580s 5s/step - loss: 0.0982 - acc: 0.9700 - val_loss: 1.2771 - val_acc: 0.5572

That is a huge difference between training and validation accuracy. Kindly help me figure out why my model is not able to generalize, even though it fits the training data quite well.

My code for reference:

from keras.utils import to_categorical
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout
from keras.applications.inception_v3 import InceptionV3, preprocess_input

CLASSES = 4

# setup model: InceptionV3 base without its classifier, plus a new 4-class head
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])        

from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
df['Category']= encoder.fit_transform(df['Category'])



from keras.preprocessing.image import ImageDataGenerator

WIDTH = 299
HEIGHT = 299
BATCH_SIZE = 32


train_datagen = ImageDataGenerator(rescale=1./255,preprocessing_function=preprocess_input)



validation_datagen = ImageDataGenerator(rescale=1./255)


df['Category'] =df['Category'].astype(str)
#dfval['Category'] = dfval['Category'].astype(str)




from sklearn.utils import shuffle
df = shuffle(df)

from sklearn.model_selection import train_test_split

dftrain,dftest = train_test_split(df, test_size = 0.2, random_state = 0)

train_generator = train_datagen.flow_from_dataframe(dftrain,target_size=(HEIGHT, WIDTH),batch_size=BATCH_SIZE,class_mode='categorical', x_col='Path', y_col='Category')

validation_generator = validation_datagen.flow_from_dataframe(dftest,target_size=(HEIGHT, WIDTH),batch_size=BATCH_SIZE,class_mode='categorical', x_col='Path', y_col='Category')



EPOCHS = 5
BATCH_SIZE = 32
STEPS_PER_EPOCH = 320
VALIDATION_STEPS = 64

MODEL_FILE = 'filename.model'

history = model.fit_generator(
    train_generator,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    validation_data=validation_generator,
    validation_steps=VALIDATION_STEPS)

Any help would be appreciated :)

Hard to say without being able to look at your data. What does it look like? - Joseph Budin
Inception is a very deep neural network and works best with a large amount of data. 1043 images is very few for this type of network, hence it is overfitting. Try increasing the number of images. - techytushar
@JosephBudin The images are scanned copies of documents. - dataguy
@techytushar I am already augmenting the images. - dataguy
@JosephBudin Results without augmentation: Epoch 1/5 320/320 [==============================] - 3497s 11s/step - loss: 0.3839 - acc: 0.8761 - val_loss: 1.4405 - val_acc: 0.4658 - dataguy

1 Answer


If you don't apply preprocess_input to all of your data (training and validation alike), you will get terrible results.

Look at these:

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input, 
    ...)

validation_datagen = ImageDataGenerator()

I also notice that you are using rescale. Since you imported the correct preprocess_input function from the InceptionV3 module, I really think you should not use rescale as well. The preprocess_input function is supposed to do all the necessary preprocessing. (Not all models were trained with normalized inputs.)
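To make the mismatch concrete, here is a minimal pure-Python sketch of the value ranges each pipeline produces. The helper names are made up for illustration. `inception_style_preprocess` only mimics the [-1, 1] scaling that the InceptionV3 preprocess_input performs, and the sketch assumes the preprocessing function runs before rescale. In the other order the numbers change, but the training and validation ranges still disagree.

```python
def inception_style_preprocess(pixel):
    """Mimic Inception-style preprocessing: map a pixel in [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


def rescale(value, factor=1.0 / 255):
    """Mimic the generator's rescale option: multiply by a constant factor."""
    return value * factor


def value_range(transform):
    """Return (min, max) of a transform over every possible 8-bit pixel value."""
    values = [transform(float(p)) for p in range(256)]
    return min(values), max(values)


# Range the pretrained network expects: preprocess_input only.
expected_range = value_range(inception_style_preprocess)

# Training generator in the question: preprocess_input AND rescale
# (assumed order: preprocessing function first, then rescale).
train_range = value_range(lambda p: rescale(inception_style_preprocess(p)))

# Validation generator in the question: rescale only.
val_range = value_range(rescale)

print("expected by the network:", expected_range)
print("training pipeline:      ", train_range)
print("validation pipeline:    ", val_range)
```

All three ranges differ, so the new classification head is trained on inputs the base network was not built for and then validated on inputs from yet another range. Using preprocess_input alone in both generators removes the discrepancy.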

But would rescale be a problem if you're applying it to both datasets?

Well... if trainable=False was applied correctly to the BatchNormalization layers, these layers use their stored mean and variance, which will only work well if the data is within the expected range.
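As a rough illustration of that last point, here is a simplified sketch of a frozen batch-normalization step in plain Python. The stored statistics are hypothetical, and in the real network these layers act on convolution activations rather than raw pixels, so treat it as an intuition aid and not as a reproduction of InceptionV3.

```python
import math


def frozen_batch_norm(x, moving_mean, moving_var, eps=1e-3):
    """Inference-mode batch norm (gamma=1, beta=0): uses stored statistics only."""
    return (x - moving_mean) / math.sqrt(moving_var + eps)


def mean(values):
    return sum(values) / len(values)


# Hypothetical stored statistics for a layer that was trained on
# values spread evenly over [-1, 1] (mean 0, variance 1/3).
MOVING_MEAN = 0.0
MOVING_VAR = 1.0 / 3.0

# Inputs in the range the layer was trained on ...
expected_inputs = [i / 50.0 - 1.0 for i in range(101)]  # -1.0 ... 1.0
# ... versus inputs squeezed into a different range.
shifted_inputs = [i / 100.0 for i in range(101)]  # 0.0 ... 1.0

out_expected = [frozen_batch_norm(x, MOVING_MEAN, MOVING_VAR) for x in expected_inputs]
out_shifted = [frozen_batch_norm(x, MOVING_MEAN, MOVING_VAR) for x in shifted_inputs]

print("mean output, expected range:", mean(out_expected))
print("mean output, shifted range: ", mean(out_shifted))
```

With inputs in the expected range the normalized output is centered on zero. With the shifted range it is not, because a frozen layer cannot adapt its statistics to the new data.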