I'm training a model on EMNIST (128x128 grayscale images) and I'm having trouble understanding how to properly load the data into TensorFlow for modeling.
I was following the flower example provided by TensorFlow (https://www.tensorflow.org/hub/tutorials/image_feature_vector), changing only the CNN structure, until model.fit() failed with the error: Input 0 of layer conv2d_120 is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]
Loading the Dataset
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
batch_size = 32
image_w = 64
image_h = 64
seed = 123
data_dir = r'B:\Datasets\EMNIST Digital Number & Digits\OriginalDigits'  # raw string so the backslashes are kept literally
train_df = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=seed,
    image_size=(image_w, image_h),
    batch_size=batch_size)
val_df = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",  # Same exact code block ... this is the only line of difference
    seed=seed,
    image_size=(image_w, image_h),
    batch_size=batch_size)
Found 10160 files belonging to 10 classes.
Using 8128 files for training.
Found 10160 files belonging to 10 classes.
Using 2032 files for validation.
Confirmation that the data loaded correctly
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_df.take(1):  # take(1) yields at most one element, which here is one batch (the batch seems to be picked at random)
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(labels[i].numpy().astype("str"))
        plt.axis("off")
Processing the dataset into tf.data.Dataset object
class_labels = train_df.class_names
num_classes = len(class_labels)
print(class_labels,num_classes)
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] 10
AUTOTUNE = tf.data.experimental.AUTOTUNE
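train_df_modeling = train_df.cache().shuffle(len(train_df))  # Cache training data in memory + shuffle; the dataset is already batched, so len(train_df) counts batches and whole batches are shuffled
val_df_modeling = val_df.cache().shuffle(len(train_df))  # Cache validation data in memory (also shuffled, with the same buffer size)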
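A quick arithmetic check on what len(train_df) means here. This is a minimal sketch, assuming len() on a batched dataset returns the number of batches (with the last partial batch kept) rather than the number of images. The file counts and batch size come from the loader output above; the helper name num_batches is made up for illustration.

```python
import math


def num_batches(num_files: int, batch_size: int) -> int:
    """Number of batches when the final, partial batch is kept."""
    return math.ceil(num_files / batch_size)


# File counts reported by image_dataset_from_directory above.
train_batches = num_batches(8128, 32)
val_batches = num_batches(2032, 32)

print(train_batches, val_batches)
```

Under that assumption, the shuffle buffer above is sized in batches, not in individual images.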
Define model
#Model from https://www.kaggle.com/henseljahja/simple-tensorflow-cnn-98-8
model = keras.models.Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(image_h, image_w, 1)),  # (64,64,1)
    layers.Conv2D(64, 7, padding='same', activation='relu'),
    layers.GaussianNoise(0.2),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(units=256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=64, activation='relu'),
    layers.Dropout(0.5),
    keras.layers.Dense(num_classes, activation='softmax'),  # 10 outputs [0,1,2,3,4,5,6,7,8,9]
])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling (Rescaling)        (None, 64, 64, 1)         0
_________________________________________________________________
conv2d (Conv2D)              (None, 64, 64, 64)        640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 128)       73856
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 128)       147584
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 128)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0
_________________________________________________________________
dense (Dense)                (None, 256)               2097408
_________________________________________________________________
dropout (Dropout)            (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 2,656,458
Trainable params: 2,656,458
Non-trainable params: 0
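The parameter counts in this summary can be reproduced by hand, which also shows how many input channels each layer was built for. Below is a minimal sketch using the usual counting rules, (kernel_h * kernel_w * in_channels + 1) * filters for Conv2D and (inputs + 1) * units for Dense. The helper names conv2d_params and dense_params are made up for illustration. The 640 parameters of the first Conv2D match a 3x3 kernel on 1 input channel; a 7x7 kernel on 1 channel would give 3200, so this summary may have been printed from a slightly different version of the model than the code shown.

```python
def conv2d_params(kernel_size: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(inputs: int, units: int) -> int:
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units


first_conv_gray_3x3 = conv2d_params(3, 1, 64)   # 1 input channel, 3x3 kernel
first_conv_gray_7x7 = conv2d_params(7, 1, 64)   # 1 input channel, 7x7 kernel
first_conv_rgb_3x3 = conv2d_params(3, 3, 64)    # 3 input channels, 3x3 kernel

# Remaining layers, in the order they appear in the summary.
rest = [
    conv2d_params(3, 64, 128),      # conv2d_1
    conv2d_params(3, 128, 128),     # conv2d_2
    conv2d_params(3, 128, 128),     # conv2d_3
    conv2d_params(3, 128, 128),     # conv2d_4
    dense_params(8 * 8 * 128, 256), # dense (input is the flattened 8x8x128 map)
    dense_params(256, 128),         # dense_1
    dense_params(128, 64),          # dense_2
    dense_params(64, 10),           # dense_3
]
total = first_conv_gray_3x3 + sum(rest)

print(first_conv_gray_3x3, first_conv_gray_7x7, first_conv_rgb_3x3, total)
```

The total only adds up to the reported 2,656,458 when the first layer is counted with a single input channel, which is consistent with the model expecting axis -1 to have value 1.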
Training the model
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer='nadam',
    metrics=['accuracy']
)
result = model.fit(train_df_modeling,
                   validation_data=val_df_modeling,
                   epochs=20,
                   verbose=1)
ValueError: Input 0 of layer conv2d is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]
I understand that my problem is related to shape and that [None, 64, 64, 3] is [batch_size, height, width, channels], but I have the following questions:
- Why does it expect axis -1 of the input shape to have value 1? Shouldn't the Conv2D layer be expecting an image?
- Why does my input have 3 channels? I told it the input has only 1 channel.
Note: Removing the Rescaling layer and making Conv2D the first layer still gives the same error: it expects value 1 but receives an input of shape 64x64x3.