
I'm attempting to train a convolutional autoencoder to reconstruct images that are 80x130.

I've added all the necessary imports, and I'm writing it in Python 3.7.

This is the error I get:

Traceback (most recent call last):
  File "CAED_Keras.py", line 52, in <module>
    validation_data=(x_train, x_train))
  File "C:\Python37\lib\site-packages\keras\engine\training.py", line 1154, in fit
    batch_size=batch_size)
  File "C:\Python37\lib\site-packages\keras\engine\training.py", line 621, in _standardize_user_data
    exception_prefix='target')
  File "C:\Python37\lib\site-packages\keras\engine\training_utils.py", line 145, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected conv2d_7 to have shape (4, 76, 1) but got array with shape (1, 80, 130)

Here is my code:

input_img = Input(shape=(80, 130, 1)) 

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

#Load Data
x_train = np.copy(lt.mel_spect_out[:int(len(lt.mel_spect_out)/10*9)])
x_test = np.copy(lt.mel_spect_out[int(len(lt.mel_spect_out)/10*9):])

#normalize
x_train = x_train / 255.
x_test = x_test / 255.


x_train = np.reshape(x_train, (len(x_train), 80, 130, 1))  
x_test = np.reshape(x_test, (len(x_test), 80, 130, 1))  

autoencoder.fit(x_train, x_train,
            epochs=50,
            batch_size=30,
            shuffle=True,
            validation_data=(x_train, x_train))

1 Answer


You have a few problems:

The network's architecture

I changed a few things: first, I added an UpSampling2D layer right after the encoded layer and removed the final one, so the decoder mirrors the encoder. Then, I added padding='same' to the last 16-filter conv layer to prevent your data from being "cropped".

Here is the resulting code:

# encoder
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# decoder: UpSampling2D moved right after the encoded layer, and every conv
# layer uses padding='same' so nothing gets cropped on the way back up
x = UpSampling2D((2, 2))(encoded)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)  # sigmoid output to match the [0, 1]-normalized targets
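
To double-check the shapes, rebuild the model with this decoder and print its summary. A minimal sketch, reusing input_img and the compile settings from your question:

# Sketch: inspect the output shape of the fixed architecture
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.summary()
# With an 80x130 input the last Conv2D reports (None, 80, 136, 1), which
# still doesn't match the (80, 130, 1) target; that leads to the next point.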

The input shape

The problem here is that you downsample your data 3 times with 2x2 poolings, so each spatial dimension needs to be divisible by 2^3 = 8 to come back to its original size after the three upsamplings.

No problem for the first dimension: 80 % 2^3 = 0, so 80 -> 40 -> 20 -> 10 -> 80.

But for the second one: 130 % 2^3 = 2, which is problematic: pooling gives 130 -> 65 -> 33 -> 17, and upsampling brings that back up to 17 -> 34 -> 68 -> 136, which no longer matches the 130-wide target.

What can you do?

  • You could crop the 130-wide dimension of your images down to 128 (128 % 2^3 = 0); see the sketch after this list
  • I don't have another idea at the moment
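
Here is a minimal sketch of that cropping, assuming your arrays keep the (80, 130) layout from the question; only the slice and the Input shape change:

# Sketch: crop the 130-wide axis down to 128 so it divides evenly by 2^3
x_train = np.reshape(x_train, (len(x_train), 80, 130, 1))[:, :, :128, :]
x_test = np.reshape(x_test, (len(x_test), 80, 130, 1))[:, :, :128, :]

# the model's input has to match the cropped shape
input_img = Input(shape=(80, 128, 1))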

One last note

You used

validation_data=(x_train, x_train)

but you already split off x_test and never use it. Validating on the data you train on just repeats the training loss, so it can't reveal overfitting; validate on the held-out set instead.
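
Something like this instead, keeping the rest of your fit arguments:

# Sketch: validate on the held-out split instead of the training data
autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=30,
                shuffle=True,
                validation_data=(x_test, x_test))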