Input 0 of layer conv2d is incompatible with layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]

Question

I'm training a model on EMNIST (128x128 grayscale images) and I'm having trouble understanding how to properly load the data into TensorFlow for modeling.

I was following the flower example provided by TensorFlow (https://www.tensorflow.org/hub/tutorials/image_feature_vector), changing only the CNN structure, until model.fit() failed with the error:
Input 0 of layer conv2d_120 is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]

Loading the Dataset

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

batch_size = 32
image_w = 64
image_h = 64
seed = 123

data_dir = r'B:\Datasets\EMNIST Digital Number & Digits\OriginalDigits'  # raw string so the backslashes are not treated as escapes

train_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=seed,
  image_size=(image_h, image_w),  # image_size is (height, width)
  batch_size=batch_size)

val_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",  # Same code block as above; this is the only line that differs
  seed=seed,
  image_size=(image_h, image_w),  # image_size is (height, width)
  batch_size=batch_size)

Found 10160 files belonging to 10 classes.
Using 8128 files for training.
Found 10160 files belonging to 10 classes.
Using 2032 files for validation.

Confirmation that the data loaded correctly

import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_df.take(1):  # take(1) yields at most one element, here one batch of up to 32 images (the dataset is shuffled by default)
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(labels[i].numpy().astype("str"))
        plt.axis("off")
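
A grayscale image loaded as RGB has three identical channels and still looks gray in the plot, so the preview alone cannot reveal the channel count. Printing the dataset's element spec is a more direct check. A minimal sketch, assuming a synthetic stand-in dataset (the zero-filled arrays and the name demo_ds are made up for illustration; with the real data the same attribute would be read from train_df):

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for train_df: 8 RGB-shaped images of 64x64, batched by 4.
# (Assumption for illustration only; the real dataset comes from
# image_dataset_from_directory.)
images = np.zeros((8, 64, 64, 3), dtype="float32")
labels = np.zeros((8,), dtype="int32")
demo_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# element_spec describes each batch without iterating over the data.
image_spec, label_spec = demo_ds.element_spec
print(image_spec.shape)  # the last axis is the channel count

# Iterating over one batch gives the same information at runtime.
for batch_images, batch_labels in demo_ds.take(1):
    print(batch_images.shape, batch_labels.shape)
```

The last axis of the image shape is the number of channels the model will receive, so it should match the last value of the model's input_shape.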

Processing the dataset into tf.data.Dataset object

class_labels = train_df.class_names
num_classes = len(class_labels)
print(class_labels,num_classes)

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'] 10

AUTOTUNE = tf.data.experimental.AUTOTUNE

train_df_modeling = train_df.cache().shuffle(len(train_df))  # Cache in memory, then shuffle; len(train_df) counts batches (254), not images
val_df_modeling = val_df.cache().shuffle(len(train_df))  # Cache validation data in memory (the shuffle is not needed for validation)

Define model

#Model from https://www.kaggle.com/henseljahja/simple-tensorflow-cnn-98-8
model = keras.models.Sequential([

    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(image_h, image_w, 1)), #(64,64,1)
    layers.Conv2D(64, 7, padding='same', activation='relu'),    
    layers.GaussianNoise(0.2),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding="SAME"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(units=256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(units=64, activation='relu'),
    layers.Dropout(0.5),    
    keras.layers.Dense(num_classes, activation='softmax'), #10 outputs [0,1,2,3,4,5,6,7,8,9]
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
rescaling (Rescaling)        (None, 64, 64, 1)         0
_________________________________________________________________
conv2d (Conv2D)              (None, 64, 64, 64)        640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 64)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 128)       73856
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 128)       147584
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 128)       0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 16, 16, 128)       147584
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 128)         0
_________________________________________________________________
flatten (Flatten)            (None, 8192)              0
_________________________________________________________________
dense (Dense)                (None, 256)               2097408
_________________________________________________________________
dropout (Dropout)            (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               32896
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650
=================================================================
Total params: 2,656,458
Trainable params: 2,656,458
Non-trainable params: 0
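
The parameter counts in the summary can be reproduced by hand, which also shows that a convolution's weights depend on the number of input channels. A minimal sketch in plain Python, assuming the standard formulas for a biased Conv2D and Dense layer (the helper names conv2d_params and dense_params are made up for illustration). Note that the 640 reported for the first layer corresponds to a 3x3 kernel, so the summary was presumably printed from a slightly different version of the model than the listing with kernel size 7.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of a Conv2D layer with bias: one kernel per (input channel, filter) pair."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weights of a Dense layer with bias."""
    return in_units * out_units + out_units


# First convolution with 1 input channel (grayscale), as in the summary.
first_conv_gray = conv2d_params(3, 3, 1, 64)
# The same layer built for 3 input channels (RGB) would need more weights.
first_conv_rgb = conv2d_params(3, 3, 3, 64)

total = (
    first_conv_gray
    + conv2d_params(3, 3, 64, 128)        # conv2d_1
    + 3 * conv2d_params(3, 3, 128, 128)   # conv2d_2, conv2d_3, conv2d_4
    + dense_params(8 * 8 * 128, 256)      # dense (after Flatten)
    + dense_params(256, 128)              # dense_1
    + dense_params(128, 64)               # dense_2
    + dense_params(64, 10)                # dense_3
)

print(first_conv_gray, first_conv_rgb, total)
```

Because the kernel is sized for a fixed number of input channels when the layer is built, a layer built for 1 channel cannot be applied to a 3-channel batch.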

Training the model

model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer = 'nadam',
    metrics=['accuracy']
)

result = model.fit(train_df_modeling,
                   validation_data=val_df_modeling,
                   epochs=20,
                   verbose=1)

ValueError: Input 0 of layer conv2d is incompatible with the layer: expected axis -1 of input shape to have value 1 but received input with shape [None, 64, 64, 3]

I understand that my problem is related to shape and that [None, 64, 64, 3] is [batch_size, height, width, channels], but I have the following questions:

1. Why does it expect the input shape to have value 1? Shouldn't the Conv2D layer be expecting an image?
2. Why does my input have 3 channels? I told it the input has only 1 channel.

Note: Removing the rescaling layer and making Conv2D the first layer still gives the same error (expected value 1 but received 64x64x3).

Shyu Shyu · Accepted Answer · 2021-05-16T05:14:36

Well ... while typing up the last section with my questions, I came across the answer to question #2.

My data, although grayscale, was being read by TensorFlow as RGB because I never specified a color mode (image_dataset_from_directory defaults to color_mode='rgb').

Solution

Read data in as gray-scale

Documentation: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory

Argument of interest: color_mode='grayscale'

Modification to my code to get it working:

I only needed to change one block of code (the two dataset-loading calls):

data_dir = r'B:\Datasets\EMNIST Digital Number & Digits\OriginalDigits'  # raw string so the backslashes are not treated as escapes

train_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=seed,
  image_size=(image_h, image_w),  # image_size is (height, width)
  batch_size=batch_size,
  color_mode='grayscale') #<---- This was the missing link

val_df = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=seed,
  image_size=(image_h, image_w),  # image_size is (height, width)
  batch_size=batch_size,
  color_mode='grayscale') #<---- This was the missing link

This fixes the model and lets the code run, but can anyone answer question #1? I'm still curious why the layer expected the input to have value 1 when I thought the input should simply be an image.
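
A possible reading of the message, for what it's worth: "axis -1" is the last axis of the input shape, which for image batches shaped [batch_size, height, width, channels] is the channel axis. "Expected axis -1 to have value 1" therefore means "expected 1 channel", which matches the input_shape=(image_h, image_w, 1) declared in the model. The layer does expect an image, just one with a single channel. The plain-Python sketch below mimics that kind of check; check_last_axis is a hypothetical helper written for illustration, not Keras's actual implementation.

```python
def check_last_axis(expected_channels, input_shape):
    """Hypothetical stand-in for the shape compatibility check of a conv layer."""
    actual = input_shape[-1]  # axis -1 is the last axis: channels in NHWC layout
    if actual != expected_channels:
        raise ValueError(
            f"expected axis -1 of input shape to have value {expected_channels} "
            f"but received input with shape {list(input_shape)}"
        )
    return True


# The model was declared with one channel, so a grayscale batch passes.
ok = check_last_axis(1, (None, 64, 64, 1))

# A batch loaded with the default color_mode='rgb' has 3 channels and fails.
try:
    check_last_axis(1, (None, 64, 64, 3))
    message = ""
except ValueError as err:
    message = str(err)

print(ok)
print(message)
```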