6 votes

Could someone please explain why the autoencoder below is not converging? As far as I can tell, the two networks should be equivalent, yet the autoencoder fails to converge while the plain network beneath it converges fine.

# autoencoder implementation, does not converge
from keras.models import Sequential
from keras.layers import containers
from keras.layers.core import Dense, AutoEncoder
from keras.optimizers import RMSprop

autoencoder = Sequential()
encoder = containers.Sequential([Dense(32, 16, activation='tanh')])
decoder = containers.Sequential([Dense(16, 32)])
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder,
                            output_reconstruction=True))
rms = RMSprop()
autoencoder.compile(loss='mean_squared_error', optimizer=rms)

# trainData/testData are my own arrays with 32 features per sample
autoencoder.fit(trainData, trainData, nb_epoch=20, batch_size=64,
                validation_data=(testData, testData), show_accuracy=False)

# non-autoencoder implementation, converges
numEpochs = 20    # same settings as the autoencoder run above
batch_size = 64

model = Sequential()
model.add(Dense(32, 16, activation='tanh'))
model.add(Dense(16, 32))
model.compile(loss='mean_squared_error', optimizer=rms)

model.fit(trainData, trainData, nb_epoch=numEpochs, batch_size=batch_size,
          validation_data=(testData, testData), show_accuracy=False)

2 Answers

2 votes

I think Keras's AutoEncoder implementation ties the weights of the encoder and decoder, whereas in your second implementation the encoder and decoder have separate weights. If the untied version performs much better on the test data, that suggests your problem needs untied weights.
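
For intuition, here is a minimal NumPy sketch (not Keras code; the shapes just match your example) of the difference: a tied decoder reuses the transposed encoder matrix, while an untied decoder learns its own separate matrix.

import numpy as np

rng = np.random.RandomState(0)
W = rng.randn(32, 16) * 0.1      # encoder weight matrix
b_enc = np.zeros(16)
b_dec = np.zeros(32)

def encode(x):
    return np.tanh(x.dot(W) + b_enc)

def decode_tied(h):
    # tied weights: the decoder reuses W transposed, no extra matrix to learn
    return h.dot(W.T) + b_dec

V = rng.randn(16, 32) * 0.1      # untied: a separate decoder matrix
def decode_untied(h):
    return h.dot(V) + b_dec

x = rng.randn(5, 32)
print(decode_tied(encode(x)).shape)    # (5, 32)
print(decode_untied(encode(x)).shape)  # (5, 32)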

2 votes

The new version (0.3.0) of Keras no longer uses tied weights in AutoEncoder, and the two models still converge differently. The reason is that their weights are initialized differently.

In the non-AE example, the Dense(32,16) weights are initialized first, followed by Dense(16,32). In the AE example, Dense(32,16) is initialized first, then Dense(16,32), and then, when you create the AutoEncoder instance, Dense(32,16) is initialized again (self.encoder.set_previous(node) calls build(), which re-initializes the weights). So even starting from the same seed, the two models begin training from different weights.
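
To see why the extra initialization matters even with a fixed seed, here is a minimal NumPy sketch (plain uniform draws stand in for Keras's actual initializer; shapes are illustrative): the AE path consumes one extra draw from the seeded stream, so its effective encoder weights differ.

import numpy as np

np.random.seed(0)
w1_plain = np.random.uniform(-0.05, 0.05, (32, 16))  # Dense(32,16)
w2_plain = np.random.uniform(-0.05, 0.05, (16, 32))  # Dense(16,32)

np.random.seed(0)
w1_ae = np.random.uniform(-0.05, 0.05, (32, 16))     # Dense(32,16)
w2_ae = np.random.uniform(-0.05, 0.05, (16, 32))     # Dense(16,32)
w1_ae = np.random.uniform(-0.05, 0.05, (32, 16))     # re-drawn by build()

print(np.array_equal(w1_plain, w1_ae))  # False: different starting weights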

Once the weights are copied across with set_weights and NumPy's RNG is seeded before each fit (so the batch shuffling is identical), the following two NNs converge exactly the same:

import numpy as np  # (Keras imports as in the question)

autoencoder = Sequential()
encoder = containers.Sequential([Dense(32, 16, activation='tanh')])
decoder = containers.Sequential([Dense(16, 32)])
autoencoder.add(AutoEncoder(encoder=encoder, decoder=decoder,
                            output_reconstruction=True))
rms = RMSprop()
autoencoder.compile(loss='mean_squared_error', optimizer=rms)
np.random.seed(0)  # fix the shuffling order during training
autoencoder.fit(trainData, trainData, nb_epoch=20, batch_size=64,
                validation_data=(testData, testData), show_accuracy=False)

# non-autoencoder
model = Sequential()
model.add(Dense(32, 16, activation='tanh'))
model.add(Dense(16, 32))
model.set_weights(autoencoder.get_weights())  # give it the autoencoder's weights
model.compile(loss='mean_squared_error', optimizer=rms)
np.random.seed(0)  # same shuffling order as the autoencoder run
model.fit(trainData, trainData, nb_epoch=20, batch_size=64,
          validation_data=(testData, testData), show_accuracy=False)