I am using Keras/TensorFlow in Colab and working on the oxford_flowers102 dataset; the task is image classification. The dataset has quite a few categories (102) and not many images per class. I tried building different neural networks, from simple ones to more complex ones, with and without image augmentation, dropout, hyperparameter tuning, batch size adjustment, optimizer adjustment, image resizing ... However, I was not able to find a CNN that gives me an acceptable val_accuracy and, ultimately, a good test accuracy. So far the best val_accuracy I was able to get was a poor 0.3x. I am pretty sure it is possible to get better results; I am somehow just not finding the right CNN setup. My code so far:
import tensorflow as tf
from keras.models import Model
import tensorflow_datasets as tfds
import tensorflow_hub as hub
# upgrade Colab's tensorflow_datasets to the current version (3.2.0);
# otherwise tfds.load raises an error when loading the oxford_flowers102 dataset
!pip install tensorflow_datasets --upgrade
# restart runtime afterwards
oxford, info = tfds.load("oxford_flowers102", with_info=True, as_supervised=True)
train_data = oxford['train']
test_data = oxford['test']
validation_data = oxford['validation']
IMG_SIZE = 224
def format_example(image, label):
    image = tf.cast(image, tf.float32)
    image = image / 255.0  # scale pixel values to [0, 1]
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return image, label
train = train_data.map(format_example)
validation = validation_data.map(format_example)
test = test_data.map(format_example)
BATCH_SIZE = 32
SHUFFLE_BUFFER_SIZE = 1000
train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
validation_batches = validation.batch(BATCH_SIZE)
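For the runs with image augmentation, the pipeline got an extra map step roughly along these lines (a sketch, the exact ops and parameters varied between experiments):

def augment(image, label):
    # random horizontal flip plus a small brightness jitter
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.clip_by_value(image, 0.0, 1.0)  # keep pixels in [0, 1] after the jitter
    return image, label

train_batches_aug = train.map(augment).shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)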
First model I tried:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(102)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_batches, validation_data=validation_batches, epochs=20)
Epoch 20/20 32/32 [==============================] - 4s 127ms/step - loss: 2.9830 - accuracy: 0.2686 - val_loss: 4.8426 - val_accuracy: 0.0637
When I run it for more epochs it overfits: val_loss goes up while val_accuracy does not improve.
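The divergence is easy to see by plotting the History object returned by model.fit (minimal sketch):

import matplotlib.pyplot as plt

# training loss keeps falling while validation loss climbs -> overfitting
plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.xlabel('epoch')
plt.legend()
plt.show()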
Second model (very simple one):
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(102)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_batches, validation_data=validation_batches, epochs=20)
Does not work at all: the loss gets stuck at 4.6250, which is ln(102), i.e. the cross-entropy of a uniform guess over the 102 classes, so the model is learning nothing.
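A quick check confirms that this plateau is exactly the loss of a uniform guess:

import numpy as np

# cross-entropy of predicting probability 1/102 for every class: -log(1/102) = log(102)
print(np.log(102))  # 4.62497..., which matches the stuck loss of 4.6250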
Third model:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(102)
])
base_learning_rate = 0.0001
model.compile(optimizer=tf.optimizers.RMSprop(learning_rate=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_batches, validation_data=validation_batches, epochs=20)
The model overfits; val_accuracy does not get above 0.15.
I added dropout layers to this model (trying different rates) and also adjusted the kernel sizes, but with no real improvement. I also tried the Adam optimizer.
Fourth model:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(128, (3,3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(256, (3,3), activation='relu'),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.4),
    tf.keras.layers.Dense(102)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_batches, validation_data=validation_batches, epochs=20)
Same problem again: no good val_accuracy. I also tried it with the RMSprop optimizer but was not able to get a val_accuracy higher than 0.2.
Fifth model:
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (2,2), activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Flatten(),  # no-op here: GlobalAveragePooling2D already outputs a flat vector
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(102)
])
base_learning_rate = 0.001
model.compile(optimizer=tf.optimizers.RMSprop(learning_rate=base_learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_batches, validation_data=validation_batches, epochs=250)
val_accuracy peaked at around 0.3x. I also tried it with Adam.
When I tried transfer learning with MobileNet, I immediately got 0.7x val_accuracy within 10 epochs. So why am I not able to get close to that with a self-built CNN? I do not expect 0.8 or to beat MobileNet, but where is my mistake? What would a self-built CNN look like that reaches, let's say, 0.6-0.7 val_accuracy?
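For reference, the transfer-learning run looked roughly like this (a sketch from memory; the exact TF Hub module URL and classifier head may have differed slightly):

feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4",
    input_shape=(IMG_SIZE, IMG_SIZE, 3),
    trainable=False)  # freeze the pretrained MobileNetV2 weights

transfer_model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(102)
])
transfer_model.compile(optimizer='adam',
                       loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                       metrics=['accuracy'])
transfer_model.fit(train_batches, validation_data=validation_batches, epochs=10)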