3
votes

I built a convolutional neural network in Keras.

model.add(Convolution1D(nb_filter=111, filter_length=5, border_mode='valid', activation="relu", subsample_length=1))

According to the CS231 lecture, a convolution operation creates a feature map (i.e. activation map) for each filter, and these maps are then stacked together. In my case the convolutional layer has a 300-dimensional input. Hence, I expect the following computation:

  • Each filter has a window size of 5. Consequently, each filter produces 300-5+1=296 convolutions.
  • As there are 111 filters there should be a 111*296 output of the convolutional layer.
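The arithmetic behind this expectation can be sketched as follows. This assumes the 300 values are positions along the sequence (a single input channel), not 300 channels; the helper name `conv1d_output_length` is hypothetical and not part of Keras.

```python
def conv1d_output_length(input_length, filter_length, stride=1):
    """Number of output steps of a 'valid' (unpadded) 1D convolution."""
    if filter_length > input_length:
        raise ValueError("filter is longer than the input")
    return (input_length - filter_length) // stride + 1

steps = conv1d_output_length(300, 5)  # 300 - 5 + 1
n_filters = 111
print((steps, n_filters))  # (296, 111)
```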

However, the actual output shapes look different:

convolutional_layer = model.layers[1]
conv_weights, conv_biases = convolutional_layer.get_weights()

print(conv_weights.shape) # (5, 1, 300, 111)
print(conv_biases.shape)  # (111,)

The shape of the bias values makes sense, because there is one bias value for each filter. However, I do not understand the shape of the weights. Apparently, the first dimension depends on the filter size. The third dimension is the number of input neurons, which should have been reduced by the convolution. The last dimension probably refers to the number of filters. This does not make sense, because how should I easily get the feature map for a specific filter?

Keras uses either Theano or TensorFlow as a backend. According to their documentation, the output of a convolution operation is a 4D tensor (batch_size, output_channel, output_rows, output_columns).

Can somebody explain the output shape to me in accordance with the CS231 lecture?

2
Well... the actual output shape is not the weights shape. You can see the output shape when you create a model and call model.summary(). But perhaps you've got inverted dimensions in the "input": (channels x 1d length) versus (1d length x channels). Try inverting the input with "Reshape((1,300))" or "Reshape((300,1))". It will depend on whether your Keras is configured for channels first or channels last. (Also, I don't know what subsample_length=1 means; it doesn't seem to be in the Keras documentation.)
Daniel Möller

2 Answers

2
votes
  • Your weight dimensions have to be [filter_height, filter_width, in_channel, out_channel]
  • In your example I think the input channel count, which is the depth of the input, is 300, and you want the output channel count to be 111
  • The total number of filters is 111, not 300*111
  • As you said yourself, there is one bias per filter, so 111 biases for 111 filters
  • Each of the 111 filters produces one convolution over the input
  • The weight shape in your case means that you are using a kernel patch of shape 5*1
  • The third dimension means that the depth of the input feature map is 300
  • The fourth dimension means that the depth of the output feature map is 111
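
To illustrate the points above, here is a minimal NumPy sketch of how a weight array with the layout from the question could be applied to an input. It is not Keras's actual implementation: the function name `conv1d_valid`, the sequence length of 20 steps, and the random values are assumptions chosen only for illustration.

```python
import numpy as np

def conv1d_valid(x, w, b):
    """Naive unpadded 1D convolution followed by ReLU.

    x: (steps, in_channels)
    w: (filter_length, 1, in_channels, out_channels), as in the question
    b: (out_channels,)
    returns: (steps - filter_length + 1, out_channels)
    """
    filter_length, _, in_channels, out_channels = w.shape
    kernel = w[:, 0, :, :]  # drop the dummy width axis
    out_steps = x.shape[0] - filter_length + 1
    out = np.empty((out_steps, out_channels))
    for t in range(out_steps):
        window = x[t:t + filter_length]  # (filter_length, in_channels)
        # Each filter sums over the window length AND all input channels.
        out[t] = np.tensordot(window, kernel, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 300))         # hypothetical: 20 steps, 300 channels
w = rng.standard_normal((5, 1, 300, 111))  # shape reported in the question
b = np.zeros(111)

out = conv1d_valid(x, w, b)
print(out.shape)        # (16, 111)
print(out[:, 0].shape)  # (16,), the feature map of filter 0
```

Under these assumptions the weight shape does not depend on the number of steps at all; only the output shape does, and the feature map of one filter is a column of the output rather than a slice of the weights.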
0
votes

Actually it makes very good sense. You learn the weights of the filters. Each filter in turn produces an output (i.e. an activation map for your input data).

The first two axes of your conv_weights.shape are the dimensions of the filter that is being learned (as you already mentioned). Your filter size is 5 x 1. Your input has 300 dimensions and you want to get 111 filters per dimension, so you end up with 300 * 111 filters of 5 * 1 weights each.

I assume that the feature map of filter #0 for dimension #0 is something like your_weights[:, :, 0, 0].
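
As a quick check of what such slices contain, the sketch below uses a zero-filled placeholder array with the shape reported in the question (the variable names are hypothetical). Note that these slices are learned weights; an activation map only exists once the weights are applied to some input.

```python
import numpy as np

# Placeholder standing in for conv_weights, shape taken from the question.
your_weights = np.zeros((5, 1, 300, 111))

one_slice = your_weights[:, :, 0, 0]  # weights of filter 0 for input dimension 0
filter_0 = your_weights[:, :, :, 0]   # all weights belonging to filter 0

print(one_slice.shape)  # (5, 1)
print(filter_0.shape)   # (5, 1, 300)

# Total trainable parameters: all weights plus one bias per filter.
n_params = your_weights.size + 111
print(n_params)         # 166611
```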