I have a Keras model with Embedding, LSTM, and Dropout layers, as well as the CRF implementation from keras_contrib.
I was trying to resume training a partly trained model whose weights I had previously saved. However, when I tried to load the previously trained model via keras_contrib's save_load_utils.load_all_weights, I received an error.
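Roughly, the save and load calls look like this (the checkpoint filename here is a placeholder):

from keras_contrib.utils import save_load_utils

# After the first round of training
save_load_utils.save_all_weights(model, 'checkpoint.h5')

# Later, on a freshly built and compiled model with the same architecture
save_load_utils.load_all_weights(model, 'checkpoint.h5')

The load call fails with the following traceback: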
line 108, in load_all_weights
    model.optimizer.set_weights(optimizer_weight_values)
line 113, in set_weights
    'of the optimizer (' + str(len(params)) + ')')
ValueError: Length of the specified weight list (36) does not match the number of weights of the optimizer (0)
Apparently, the optimizer's list of weights has length 0. The Keras implementation in optimizers.py states that set_weights "should only be called after computing the gradients (otherwise the optimizer has no weights)".
I was wondering how to manually initialize the optimizer weights so that the saved weights I am trying to load can overwrite them. I thought of training the model for a single epoch on a dummy batch of size 1, but is there a more elegant way to achieve this?
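For illustration, the dummy-batch workaround I have in mind would look roughly like this (dummy_x and dummy_y are placeholder arrays shaped like a single real sample):

import numpy as np

# Placeholder batch of size 1 with the same shapes as the real data
dummy_x = np.zeros((1, C, U), dtype='int32')
dummy_y = np.zeros((1, C, num_tags), dtype='float32')

# One gradient step forces the optimizer to create its weight variables
model.train_on_batch(dummy_x, dummy_y)

# Now the saved optimizer weights should have something to overwrite
save_load_utils.load_all_weights(model, 'checkpoint.h5')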
The entire code is on GitHub, but the model I trained is shown below for reference.
# Imports used by this snippet
from keras import regularizers
from keras.models import Sequential
from keras.layers import (Embedding, LSTM, Bidirectional, Dropout,
                          TimeDistributed, GlobalMaxPooling1D)
from keras_contrib.layers import CRF

# Initialize vocab_size & embedding_weights
# Initialize C, U, N, M, H, num_tags, and the optimizer
model = Sequential()

# Embedding applied to each of the C sub-sequences of U tokens
embedding_layer = Embedding(vocab_size, N,
                            weights=[embedding_weights], mask_zero=True,
                            embeddings_regularizer=regularizers.l2(0.0001))
model.add(TimeDistributed(embedding_layer, input_shape=(C, U)))

# Inner BiLSTM over tokens, then max pooling to one vector per sub-sequence
model.add(TimeDistributed(Bidirectional(LSTM(M // 2, return_sequences=True,
                                             kernel_regularizer=regularizers.l2(0.0001)))))
model.add(TimeDistributed(Dropout(0.2)))
model.add(TimeDistributed(GlobalMaxPooling1D()))

# Outer BiLSTM over the C pooled vectors
model.add(Bidirectional(LSTM(H // 2, return_sequences=True,
                             kernel_regularizer=regularizers.l2(0.0001))))
model.add(Dropout(0.2))

# CRF output layer from keras_contrib
crf = CRF(num_tags, sparse_target=False,
          kernel_regularizer=regularizers.l2(0.0001))
model.add(crf)

model.compile(optimizer, loss=crf.loss_function, metrics=[crf.accuracy])