
I need to initialize custom Conv2D kernels with weights

W = a1b1 + a2b2 + ... + anbn

where W = custom weight of Conv2D layer to initialise with

a = random weight Tensors as keras.backend.variable(np.random.uniform()), shape=(64, 1, 10)

b = fixed basis filters defined as keras.backend.constant(...), shape=(10, 11, 11)

W = K.sum(a[:, :, :, None, None] * b[None, None, :, :, :], axis=2) #shape=(64, 1, 11, 11)
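
The shapes in this expression can be checked with plain NumPy, which follows the same broadcasting rules as the Keras backend. This is a minimal sketch: the random values are placeholders for the real coefficients and basis filters, and `np.sum` stands in for `K.sum`.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(size=(64, 1, 10))    # trainable coefficients
b = rng.normal(size=(10, 11, 11))    # placeholder for the fixed basis filters

# Same expression as above, with np.sum in place of K.sum.
W = np.sum(a[:, :, :, None, None] * b[None, None, :, :, :], axis=2)

# Equivalent, more explicit form: contract the basis index k.
W_einsum = np.einsum('oik,kxy->oixy', a, b)

print(W.shape)                    # (64, 1, 11, 11)
print(np.allclose(W, W_einsum))   # True
```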

I want my model to update the 'W' values by changing only the 'a's while keeping the 'b's constant.

I pass the custom 'W's as

Conv2D(64, kernel_size=(11, 11), activation='relu', kernel_initializer=kernel_init_L1)(img)

where kernel_init_L1 returns keras.backend.variable(K.reshape(w_L1, (11, 11, 1, 64)))

Problem: I am not sure this is the correct way to do it. Is it possible to specify in Keras which weights are trainable and which are not? I know that layers can be set to trainable = True, but I am not sure about individual weights.

I think the implementation is incorrect because I get similar results from my model with or without the custom initializations.

It would be immensely helpful if someone can point out any mistakes in my approach or provide a way to verify it.

But, does my approach mean that only the 'a' values are updated? How do I verify this? - Anakin
Then how can I control which parameters are updated in Keras? How can I stop the 'b' values from changing, so that the changes in 'W' are only due to changes in 'a'? - Anakin
As far as I know, only layers can be set trainable=False. - Anakin
I thought about it again, and I removed my previous comments. You're right, only "layers" can be trainable/untrainable. To control what are weights, you must create a custom layer and implement a custom "build" method. - Daniel Möller

1 Answer


Warning about your shapes: If your kernel size is (11,11), and assuming you have 64 input channels and 1 output channel, your final kernel shape must be (11,11,64,1).

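You should probably be going for a[None,None] and b[:,:,:,None,None], that is, a with shape (10, input_channels, output_channels) and b with shape (11, 11, 10), so that summing over axis=2 gives the kernel directly in the (11, 11, input_channels, output_channels) layout.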
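
The layout matters because reshaping a (64, 1, 11, 11) array to (11, 11, 1, 64) only reinterprets the memory order and scrambles the filters; moving the axes needs a transpose (in the Keras backend, `K.permute_dimensions`). The NumPy sketch below illustrates this with random placeholder values, and then shows that the indexing suggested above yields the Keras layout directly, assuming `a` is stored as (n_basis, in, out) and `b` as (kh, kw, n_basis).

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 1, 11, 11))   # (out, in, kh, kw), as in the question

reshaped = W.reshape(11, 11, 1, 64)          # reinterprets memory order
transposed = np.transpose(W, (2, 3, 1, 0))   # moves axes: (kh, kw, in, out)

# Filter 5 should come back unchanged as an 11x11 array.
print(np.array_equal(transposed[:, :, 0, 5], W[5, 0]))   # True
print(np.array_equal(reshaped[:, :, 0, 5], W[5, 0]))     # False

# Assumed shapes for the suggested indexing (not the question's shapes).
a = rng.uniform(size=(10, 1, 64))   # (n_basis, in, out)
b = rng.normal(size=(11, 11, 10))   # (kh, kw, n_basis)
kernel = np.sum(a[None, None] * b[:, :, :, None, None], axis=2)
print(kernel.shape)   # (11, 11, 1, 64)
```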

from keras import backend as K
from keras.layers import Conv2D, InputSpec

class CustomConv2D(Conv2D):

    def __init__(self, filters, kernel_size, kernelB=None, **kwargs):
        super(CustomConv2D, self).__init__(filters, kernel_size, **kwargs)
        self.kernelB = kernelB

    def build(self, input_shape):
        #use the input_shape to calculate the shapes of A and B
        #if needed, pay attention to the "data_format" used.
        channel_axis = 1 if self.data_format == 'channels_first' else -1
        input_dim = input_shape[channel_axis]

        #this is an actual weight, because it uses `self.add_weight`
        #(shape_of_kernel_A is a placeholder you must define yourself)
        self.kernelA = self.add_weight(
                  shape=shape_of_kernel_A + (1,1), #or (1,1) + shape_of_A
                  initializer='glorot_uniform', #or select another
                  name='kernelA')

        #this is an ordinary var that will participate in the calculation
            #not a weight, not updated
        if self.kernelB is None:
            self.kernelB = K.constant(....)
            #use the shape already containing the new axes

        #in the original conv layer, this property would be the actual kernel,
        #now it's just a var that will be used in the original's "call" method
        self.kernel = K.sum(self.kernelA * self.kernelB, axis=2)
        #important: the resulting shape should be:
            #(kernelSizeX, kernelSizeY, input_channels, output_channels)

        #the following are remains of the original code for "build" in Conv2D
        #use_bias is True by default
        if self.use_bias:
            self.bias = self.add_weight(shape=(self.filters,),
                                        initializer=self.bias_initializer,
                                        name='bias',
                                        regularizer=self.bias_regularizer,
                                        constraint=self.bias_constraint)
        else:
            self.bias = None
        # Set input spec.
        self.input_spec = InputSpec(ndim=self.rank + 2,
                                    axes={channel_axis: input_dim})
        self.built = True
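
Because the kernel is computed from kernelA and kernelB, and only kernelA is registered with add_weight, any gradient reaching the kernel is passed on to kernelA through the chain rule while kernelB stays as it is. In Keras itself, `layer.trainable_weights` lists what the optimizer will update. The NumPy sketch below illustrates the chain rule with the question's shapes; the squared-error loss, the zero target and the random basis are hypothetical stand-ins, not part of the original post.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-1, 1, size=(64, 1, 10))   # trainable coefficients
b = rng.normal(size=(10, 11, 11))          # fixed basis (placeholder values)
target = np.zeros((64, 1, 11, 11))         # hypothetical regression target

def kernel(a):
    # W = sum over k of a[..., k] * b[k]
    return np.einsum('oik,kxy->oixy', a, b)

def loss(a):
    return 0.5 * np.sum((kernel(a) - target) ** 2)

# Chain rule: dL/da[o,i,k] = sum over x,y of dL/dW[o,i,x,y] * b[k,x,y]
grad_W = kernel(a) - target
grad_a = np.einsum('oixy,kxy->oik', grad_W, b)

# Finite-difference check of one entry of grad_a.
eps = 1e-4
step = np.zeros_like(a)
step[3, 0, 7] = eps
numeric = (loss(a + step) - loss(a - step)) / (2 * eps)
print(np.isclose(numeric, grad_a[3, 0, 7], rtol=1e-4))   # True

# One gradient-descent step touches only 'a'; 'b' is never written to.
b_before = b.copy()
a_new = a - 1e-3 * grad_a
print(np.array_equal(b, b_before))   # True
print(loss(a_new) < loss(a))         # True
```

The same idea gives a practical check on a trained model: save the values of the basis and of the coefficients before training, and compare them afterwards.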

Hints for custom layers

When you create a custom layer from zero (derived from Layer), you should have these methods:

  • __init__(self, ... parameters ...) - this is the constructor, called when you create a new instance of your layer. Here, you store the values the user passed as parameters. (In a Conv2D, the init would have the "filters", "kernel_size", etc.)
  • build(self, input_shape) - this is where you should create the weights (all learnable vars are created here, based on the input shape)
  • compute_output_shape(self,input_shape) - here you return the output shape based on the input shape
  • call(self,inputs) - Here you perform the actual layer calculations
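
The division of work between these methods can be sketched without Keras at all. The class below is a framework-free stand-in, not a Keras layer: the name `BasisConvSketch`, the single-channel 2D input and the `trainable_weights` list are assumptions made for illustration. It stores parameters in `__init__`, creates the learnable coefficients in `build`, and recomputes the kernel from the fixed basis in `call`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

class BasisConvSketch:
    """Plain-Python stand-in that mirrors the four layer methods."""

    def __init__(self, filters, basis):
        # Store user parameters only; no weights are created yet.
        self.filters = filters
        self.basis = np.asarray(basis, dtype=float)  # fixed 'b': (kh, kw, n_basis)
        self.trainable_weights = []
        self.built = False

    def build(self, input_shape):
        # Create the learnable coefficients 'a' once the input is known.
        n_basis = self.basis.shape[-1]
        rng = np.random.default_rng(0)
        self.coeffs = rng.uniform(-1, 1, size=(n_basis, self.filters))
        self.trainable_weights = [self.coeffs]  # 'b' is deliberately left out
        self.built = True

    def compute_output_shape(self, input_shape):
        h, w = input_shape
        kh, kw, _ = self.basis.shape
        return (h - kh + 1, w - kw + 1, self.filters)

    def call(self, inputs):
        # The kernel is rebuilt from 'a' and 'b' on every call.
        kernel = np.einsum('hwk,kf->hwf', self.basis, self.coeffs)
        windows = sliding_window_view(inputs, kernel.shape[:2])  # (H', W', kh, kw)
        return np.einsum('xyhw,hwf->xyf', windows, kernel)

    def __call__(self, inputs):
        # Keras builds lazily on first use, then delegates to call().
        if not self.built:
            self.build(inputs.shape)
        return self.call(inputs)

layer = BasisConvSketch(filters=4, basis=np.ones((3, 3, 2)))
out = layer(np.ones((8, 8)))
print(out.shape)                             # (6, 6, 4)
print(layer.compute_output_shape((8, 8)))    # (6, 6, 4)
print(len(layer.trainable_weights))          # 1
```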

Since we're not creating this layer from zero, but deriving it from Conv2D, everything is ready, all we did was to "change" the build method and replace what would be considered the kernel of the Conv2D layer.

More on custom layers: https://keras.io/layers/writing-your-own-keras-layers/

The call method for conv layers is defined in the Keras source, in class _Conv(Layer).