I am very new to neural networks and TensorFlow. Recently I have been reading up on Keras implementations of the variational autoencoder, and I found two versions of the loss function:
Version 1:

from keras import objectives, backend as K

def vae_loss(x, x_decoded_mean):
    # reconstruction term: per-pixel MSE scaled up to a sum over all original_dim pixels
    recon_loss = original_dim * objectives.mse(x, x_decoded_mean)
    # KL divergence of q(z|x) from the unit-Gaussian prior, summed over the latent dimensions
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return recon_loss + kl_loss
Version 2:

def vae_loss(x, x_decoded_mean):
    # reconstruction term: MSE averaged over pixels
    recon_loss = objectives.mse(x, x_decoded_mean)
    # KL divergence averaged (rather than summed) over the latent dimensions
    kl_loss = -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    return recon_loss + kl_loss
If my understanding is correct, version 1 sums the loss terms over the pixel and latent dimensions, while version 2 averages them. So does the scale of the loss affect the learning result? I tried testing both out, and it largely affects the scale of my latent variables. Why is this, and which form of the loss function is correct?
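To make the scale difference concrete, here is a quick back-of-the-envelope check (a sketch assuming original_dim = 784 and latent_dim = 2 as in the MNIST example, with made-up per-pixel and per-dimension values):

import numpy as np

original_dim, latent_dim = 784, 2   # 28*28 MNIST pixels, 2-D latent space

per_pixel_mse = 0.05                # hypothetical MSE per pixel
kl_per_dim = np.array([0.3, 0.7])   # hypothetical KL term per latent dimension

# version 1: reconstruction summed over pixels, KL summed over latent dims
v1 = original_dim * per_pixel_mse + kl_per_dim.sum()   # 39.2 + 1.0

# version 2: reconstruction averaged over pixels, KL averaged over latent dims
v2 = per_pixel_mse + kl_per_dim.mean()                 # 0.05 + 0.5

print(v1, v2)
# In version 1 the reconstruction term dominates (~39:1), while in version 2
# the KL term dominates (~1:10), so the two terms are traded off very
# differently, which would show up as a different latent scale.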
Update to my question: if I also multiply the KL loss by original_dim,
def vae_loss(x, x_decoded_mean):
    # reconstruction term: per-pixel cross-entropy summed over all original_dim pixels
    xent_loss = original_dim * objectives.binary_crossentropy(x, x_decoded_mean)
    # KL term summed over the latent dimensions, then additionally scaled by original_dim
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1) * original_dim
    return xent_loss + kl_loss
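Repeating the same rough arithmetic as above (hypothetical numbers again) shows what that extra factor does to the balance of the two terms:

original_dim = 784
per_pixel_xent = 0.1   # hypothetical cross-entropy per pixel
kl_sum = 1.0           # hypothetical KL summed over the latent dimensions

# version 1 weighting: reconstruction summed over pixels, KL summed once
v1 = original_dim * per_pixel_xent + kl_sum                      # 78.4 + 1.0

# updated loss: the KL term is additionally scaled by original_dim
updated = original_dim * per_pixel_xent + kl_sum * original_dim  # 78.4 + 784.0

# relative to the reconstruction term, the KL penalty is now original_dim
# times stronger than in version 1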
The latent distribution then looks like this: [image: latent distribution]

And the decoded output looks like this: [image: decoded output]
It looks like the encoder output does not contain any information. I am using the MNIST dataset and the example from https://github.com/vvkv/Variational-Auto-Encoders/blob/master/Variational%2BAuto%2BEncoders.ipynb.