I am new to deep learning and Keras. I am trying to train a U-Net with a perceptual loss in Keras, and I have a problem with the color of the output images. My input images are color (RGB).

If I don't preprocess the input image (i.e. the input is RGB in the range 0~255), the output looks like this: output image (RGB with 0~255). It is darker than the label image.

I then found that the pretrained VGG16 model uses "caffe" weights, and that the function keras.applications.vgg16.preprocess_input converts RGB to BGR and subtracts the per-channel mean. So I preprocessed the inputs with keras.applications.vgg16.preprocess_input and deprocessed the output images by adding the mean values back and converting BGR back to RGB. However, the output images are now too white: output image (vgg16.preprocess_input)

Then I added an MSE loss with a 10:1 ratio (perceptual loss : MSE). The output is no different from output image (vgg16.preprocess_input).

I want to know whether this is a common problem with perceptual loss, or whether there is something wrong in my code.

Here is my code:

preprocess image:

from PIL import Image
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.vgg16 import preprocess_input

img = load_img(datapath, grayscale=False)                      # load as RGB
img = img.resize((resize_size, resize_size), Image.BILINEAR)
img = img_to_array(img)                                        # HxWx3 float array
img = preprocess_input(img)                                    # RGB -> BGR, subtract ImageNet channel means

deprocess image:

# undo the "caffe" preprocessing: add the channel means back, then BGR -> RGB
mean = [103.939, 116.779, 123.68]
img[..., 0] += mean[0]
img[..., 1] += mean[1]
img[..., 2] += mean[2]
img = img[..., ::-1]

Perceptual loss:

from keras.applications.vgg16 import VGG16
from keras.models import Model
import keras.backend as K

def perceptual_loss(y_true, y_pred):
    vgg = VGG16(include_top=False, weights='imagenet',
                input_shape=(resize_size, resize_size, 3))
    # compare feature maps from an intermediate VGG16 layer
    loss_model = Model(inputs=vgg.input,
                       outputs=vgg.get_layer('block3_conv3').output)
    loss_model.trainable = False
    return K.mean(K.square(loss_model(y_true) - loss_model(y_pred)))
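
The 10:1 combination mentioned above looks roughly like this (a sketch; the exact weighting code isn't shown above):

def combined_loss(y_true, y_pred):
    # weighted sum: perceptual loss dominates, pixel-wise MSE acts as a small extra term
    return 10.0 * perceptual_loss(y_true, y_pred) + K.mean(K.square(y_true - y_pred))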

If you have any ideas, please tell me. Thanks a lot!!!

1 Answer

The output of "your" model is not at all related to anything regarding VGG, caffe, etc.

It's "you" who define it when you create your model.

So, if your model's outputs must be between 0 and 255, one possibility is to have its last layers as:

Activation('sigmoid')        # squash outputs to (0, 1)
Lambda(lambda x: x * 255)    # rescale to (0, 255)
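
For instance, a minimal sketch of attaching those two layers to the end of the generator (the Conv2D stand-in and shapes here are assumptions, not code from the question):

from keras.layers import Input, Conv2D, Activation, Lambda
from keras.models import Model

inp = Input((resize_size, resize_size, 3))
x = Conv2D(3, 3, padding='same')(inp)        # stand-in for the real U-Net body
x = Activation('sigmoid')(x)                 # values in (0, 1)
out = Lambda(lambda x: x * 255)(x)           # values in (0, 255), matching the 0~255 labels
model = Model(inp, out)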

You'd then need the preprocess_input function inside the perceptual loss:

from keras.applications.vgg16 import VGG16, preprocess_input
from keras.models import Model
import keras.backend as K

def perceptual_loss(y_true, y_pred):
    # apply the same "caffe" preprocessing to both tensors before feeding them to VGG16
    y_true = preprocess_input(y_true)
    y_pred = preprocess_input(y_pred)
    vgg = VGG16(include_top=False, weights='imagenet',
                input_shape=(resize_size, resize_size, 3))
    loss_model = Model(inputs=vgg.input,
                       outputs=vgg.get_layer('block3_conv3').output)
    loss_model.trainable = False
    return K.mean(K.square(loss_model(y_true) - loss_model(y_pred)))
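
Compiling the U-Net with this loss would then look something like this (the optimizer and variable names are just placeholders for illustration):

model.compile(optimizer='adam', loss=perceptual_loss)
model.fit(x_train, y_train, batch_size=8, epochs=50)    # x_train/y_train are your 0~255 image pairs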

Another possibility is to postprocess your model's output (but again, the range of the output is entirely defined by you).
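
As a sketch of that second option, assuming the model ends with a plain sigmoid so its outputs are in (0, 1), the rescaling can happen outside the model after prediction:

import numpy as np

pred = model.predict(x_test)                             # in (0, 1) if the last layer is sigmoid
pred = np.clip(pred * 255.0, 0, 255).astype('uint8')     # back to displayable 0~255 RGB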