I'm currently working on transfer learning for a sensor-based activity dataset, and I tried two methods of transferring a model that was previously trained on another dataset.
The first way of transferring was to load the trained model, cut off its last dense and softmax classification layers, add a new dense layer and softmax layer (sized for the number of new classes), freeze every layer except the newly added ones, and fit the model on the new dataset. This resulted in an F1-score of 30%.
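Roughly, the first approach looked like this (a simplified Keras sketch; the file name, input shape, and class count are placeholders for my actual setup, and I'm assuming the old head is the last two layers):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

num_new_classes = 6                    # placeholder for my new class count
x_new = np.random.rand(100, 128, 9)    # placeholder sensor windows
y_new = keras.utils.to_categorical(
    np.random.randint(0, num_new_classes, 100), num_new_classes)

# Load the model trained on the source dataset ("pretrained.h5" is a placeholder).
base = keras.models.load_model("pretrained.h5")

# Cut off the old dense + softmax head (assumed here to be the last two layers).
features = base.layers[-3].output

# Add a new dense + softmax head sized for the new classes.
logits = layers.Dense(num_new_classes)(features)
outputs = layers.Activation("softmax")(logits)
model = keras.Model(inputs=base.input, outputs=outputs)

# Freeze every layer except the two newly added ones.
for layer in model.layers[:-2]:
    layer.trainable = False

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_new, y_new, epochs=20, validation_split=0.2)
```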
The second way was to initialize a new model based on the new dataset, freeze every layer except the last ones, transfer only the weights from the loaded model to the newly initialized one, and train it. This resulted in an F1-score of roughly 90%.
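The second approach, again as a simplified sketch (the Conv1D architecture is just a stand-in for my actual network, which matches the loaded model except for the classification head):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

num_new_classes = 6                    # placeholder for my new class count
x_new = np.random.rand(100, 128, 9)    # placeholder sensor windows
y_new = keras.utils.to_categorical(
    np.random.randint(0, num_new_classes, 100), num_new_classes)

# Initialize a new model for the new dataset (stand-in architecture).
new_model = keras.Sequential([
    layers.Input(shape=(128, 9)),
    layers.Conv1D(64, 5, activation="relu"),
    layers.Conv1D(64, 5, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(100, activation="relu"),
    layers.Dense(num_new_classes, activation="softmax"),
])

pretrained = keras.models.load_model("pretrained.h5")  # placeholder path

# Transfer only the weights: overwrite the fresh (glorot_uniform) weights of
# every layer except the new head with the trained ones.
for new_layer, old_layer in zip(new_model.layers[:-1], pretrained.layers[:-1]):
    new_layer.set_weights(old_layer.get_weights())

# Freeze every layer except the last one, then train.
for layer in new_model.layers[:-1]:
    layer.trainable = False

new_model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
new_model.fit(x_new, y_new, epochs=20, validation_split=0.2)
```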
So right now I'm trying to figure out what exactly the difference between these two approaches is. In the end, the second approach is just a new model whose weights have been initialized with already trained weights rather than with weights coming from an initializer function (glorot_uniform, lecun_uniform, ...), right? To my understanding, this is also the correct way to do transfer learning: as far as I understood the concept, you only reuse the weights, not the whole model.
Still, I'm wondering what else influenced the training in the first approach so badly that it resulted in only a 30% F1-score.
Thanks and best regards.