
What I mean is this: let's say there are 64 output channels from a Conv kernel, and I would like to train only channel 1 of those 64 channels.

The question comes from reading this paper: https://arxiv.org/abs/1911.09659.

The paper suggests we could freeze some filters and train the rest during transfer learning.

However, I am wondering how to achieve this in TensorFlow.

If we only need to freeze whole layers, it is quite straightforward: just iterate over the layers in the network and set their trainable attribute to False.
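In code, whole-layer freezing looks roughly like this (a minimal sketch; model and the layer names here are just placeholders):

for layer in model.layers:
    # freeze, e.g., all conv layers by name (hypothetical names)
    if layer.name.startswith("conv"):
        layer.trainable = False

# changes to `trainable` take effect when the model is (re)compiled
model.compile(optimizer="adam", loss="mse")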

However, when it comes to individual kernels, it is more troublesome than I thought.

I have found this answer: Get the value of some weights in a model trained by TensorFlow.

I tried to get all the weights and split them up like this:

import tensorflow as tf

def get_weights_bias(model, layer_name):
    """
    This function aims to extract the kernels and biases from the original weights.
    :param model: the model we want to extract weights from
    :param layer_name: the name of the layer we want to extract the weights from
    :return: kernel_list and bias_list of the particular layer
    """
    kernel_list = []
    bias_list = []
    for layer in model.layers:
        if layer_name == layer.name:
            weights = layer.get_weights()
            print(weights[0].shape, type(weights[0]))  # kernel, shape (h, w, in, out)
            print(weights[1].shape, type(weights[1]))  # biases, shape (out,)
            bias_list = weights[1]
            # split the kernel along the output-channel axis, one tf.Variable per filter
            for j in range(weights[0].shape[-1]):
                name_weight = layer_name + "_weight_" + str(j)
                kernel = tf.Variable(initial_value=weights[0][:, :, :, j],
                                     name=name_weight, trainable=True)
                kernel = tf.expand_dims(kernel, -1)
                kernel_list.append(kernel)
    return kernel_list, bias_list

Following that answer, I faced some problems: I found it hard to restore the split kernels back into the conv layer, as layer.set_weights() only accepts NumPy arrays instead of tf.Variable objects.
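For reference, set_weights() would roughly want something like this back (converting the split kernels to NumPy and re-stacking them along the output-channel axis), which loses the separate tf.Variable handles I wanted to train:

import numpy as np

# each kernel in kernel_list has shape (h, w, in, 1) after expand_dims
kernels_np = np.concatenate([k.numpy() for k in kernel_list], axis=-1)  # (h, w, in, out)
layer.set_weights([kernels_np, bias_list])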

Any suggestions?


1 Answer


As far as I know, there is no clean way of freezing just the kernel while keeping the bias trainable using the Keras API. Nonetheless, here is how you could do it using private attributes:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense

# initialize the model
m = Sequential([Conv2D(1, (3, 3), input_shape=(3, 3, 1)), Dense(1)])

for layer in m.layers:
    # pop the kernel from the trainable weights
    kernel_w = layer._trainable_weights.pop(0)
    # add it to the non trainable weights
    layer._non_trainable_weights.append(kernel_w)

Checking that it works

A little experiment to see if the trick worked: let's look at the kernel and bias of the Dense layer:

>>> m.layers[1].kernel
<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[0.586058]], dtype=float32)>
>>> m.layers[1].bias
<tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>

Then we compile our model and train it on dummy data:

import numpy as np
# random data
X = np.random.random((100, 3, 3, 1))
y = 5 + X[:, 0, 0, 0] * 34
# training
m.compile(loss='mse')
m.fit(X, y)

And if we check our weights again:

>>> m.layers[1].kernel
<tf.Variable 'dense/kernel:0' shape=(1, 1) dtype=float32, numpy=array([[0.586058]], dtype=float32)>
>>> m.layers[1].bias
<tf.Variable 'dense/bias:0' shape=(1,) dtype=float32, numpy=array([0.00957833], dtype=float32)>

The bias changed, but not the kernel!
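Another quick sanity check (a sketch assuming the pop/append trick above): after the change, the kernels should show up among the model's non-trainable weights:

print([w.name for w in m.trainable_weights])      # only the biases remain trainable
print([w.name for w in m.non_trainable_weights])  # the popped kernels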


A note of caution

You have to be careful with this method: the kernel might not always be at index 0 of the layer's weights. Check the layers you want to use and the index of their kernel. As far as I have tested, it works for Dense and Conv2D layers.
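A small sketch of how you could inspect that before popping:

for layer in m.layers:
    for i, w in enumerate(layer._trainable_weights):
        print(layer.name, i, w.name, w.shape)
    # the kernel is usually at index 0, but you can verify it explicitly:
    # assert layer._trainable_weights[0] is layer.kernel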

Note also that this method relies on private attributes of the Layer class. It works for now, but it might break one day.