I am trying to do binary classification using a fully connected layer architecture in Keras, which is provided by the Dense class.

Here is the neural network architecture I created:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD

self.model = Sequential()
# Dense(32) is a fully-connected layer with 32 hidden units.
# In the first layer, you must specify the expected input data shape:
# here, 2000-dimensional vectors (x_train_std.shape[1] == 2000).
self.model.add(Dense(32, activation='relu', input_dim=self.x_train_std.shape[1]))
#self.model.add(Dropout(0.5))
#self.model.add(Dense(64, activation='relu'))
#self.model.add(Dropout(0.5))
self.model.add(Dense(1, activation='sigmoid'))

So I have an input matrix of shape (17000, 2000), i.e. 17K samples with 2K features each.

I have kept only one hidden layer, with 32 units (neurons) in it.

My output layer is a single neuron with a sigmoid activation function.
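
For reference, here is a minimal standalone sketch of the same architecture on dummy data (the random data is just my stand-in so the shapes are reproducible):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Dummy standardized data with the same shape as mine: 17000 samples, 2000 features.
x_train_std = np.random.randn(17000, 2000)

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=x_train_std.shape[1]))
model.add(Dense(1, activation='sigmoid'))
model.summary()
# dense_1 should show 2000 * 32 + 32 = 64,032 parameters
# dense_2 should show   32 * 1  + 1  =     33 parameters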

Now, when I check the weights of the first hidden layer, I expect them to be of size (32, 2000), where each row is for each unit in that layer and each column is for each input feature.

Here is the config of the architecture set up by Keras:

dl_1.model.get_config()
Out[70]:
[{'class_name': 'Dense',
  'config': {'activation': 'relu',
   'activity_regularizer': None,
   'batch_input_shape': (None, 2000),
   'bias_constraint': None,
   'bias_initializer': {'class_name': 'Zeros', 'config': {}},
   'bias_regularizer': None,
   'dtype': 'float32',
   'kernel_constraint': None,
   'kernel_initializer': {'class_name': 'VarianceScaling',
    'config': {'distribution': 'uniform',
     'mode': 'fan_avg',
     'scale': 1.0,
     'seed': None}},
   'kernel_regularizer': None,
   'name': 'dense_1',
   'trainable': True,
   'units': 32,
   'use_bias': True}},
 {'class_name': 'Dense',
  'config': {'activation': 'sigmoid',
   'activity_regularizer': None,
   'bias_constraint': None,
   'bias_initializer': {'class_name': 'Zeros', 'config': {}},
   'bias_regularizer': None,
   'kernel_constraint': None,
   'kernel_initializer': {'class_name': 'VarianceScaling',
    'config': {'distribution': 'uniform',
     'mode': 'fan_avg',
     'scale': 1.0,
     'seed': None}},
   'kernel_regularizer': None,
   'name': 'dense_2',
   'trainable': True,
   'units': 1,
   'use_bias': True}}]

To see the input shape of the first hidden layer:

dl_1.model.get_layer(name="dense_1").input_shape

(None, 2000)

Output shape:

dl_1.model.get_layer(name="dense_1").output_shape
Out[99]:
(None, 32)

So the layer does take 2000-dimensional input and produce 32-dimensional output, as expected.

However, when I try to check the weight matrix for this layer:

dl_1.model.get_layer(name="dense_1").get_weights()[0]

It gives me an array with 2000 rows, each row containing 32 weights, like below:

array([[ 0.0484077 , -0.02401097, -0.03099879, -0.02864455, -0.01511723,
         0.01386002,  0.01127522,  0.00844895, -0.02420873,  0.04466306,
         0.02965425,  0.0410631 ,  0.02397312,  0.0038885 ,  0.04846045,
         0.00653989, -0.05288456, -0.00325713,  0.0445733 ,  0.04594839,
         0.02839083,  0.0445912 , -0.0140048 , -0.01198476,  0.05259909,
        -0.03752745, -0.01337494, -0.02162734, -0.01522341,  0.01208428,
         0.01122886,  0.01496441],
       [ 0.05225918,  0.04231448,  0.01388102, -0.03310467, -0.05293509,
         0.01130457,  0.03127011, -0.04250741, -0.04212657, -0.01595866,
        -0.002456  ,  0.01112743,  0.0150629 ,  0.03072598, -0.04061607,
        -0.01131565, -0.02259113,  0.00907649, -0.04728404, -0.00909081,
         0.03182121, -0.04608218, -0.04411709, -0.03561752,  0.04686243,
        -0.04555761,  0.04087613,  0.04380137,  0.02079088, -0.02390963,
        -0.0164928 , -0.01228274],

I am not sure I understand this. It should be 32 x 2000, not 2000 x 32: since I have 32 units and each unit has 2000 incoming weights, the outer dimension should be 32 elements long, with each element being a 2000-dimensional NumPy array. But it's the reverse. Why is that?
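
To make my expectation concrete (shapes only; the transpose is just to illustrate what I anticipated):

W = dl_1.model.get_layer(name="dense_1").get_weights()[0]
print(W.shape)    # (2000, 32) -- what Keras actually returns
print(W.T.shape)  # (32, 2000) -- what I expected: one row of 2000 weights per unit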

The weights belong to the hidden layer, not the input layer, so indexing them by input feature first doesn't make sense to me.

Any idea what is going on here?

There is nothing going on; it's just a matter of notation and convention. - Dr. Snoopy
Yeah, I understand that. But isn't this an inconvenient convention? Think about it: when I say I want to check the weights of a layer, it basically means I want to find out, for each unit in that layer, what the weights are. It also gives me the flexibility to index by unit (1, 2, 3, ...) within a layer and then find all the weights for that unit. But the way Keras has it, it's indexed by the units of the previous layer, which means - Baktaawar

1 Answer


You are creating a Dense() layer of 32 units. Dense layers are (as the comment in your code indicates) "fully-connected" layers, which means each input feature is connected to every unit in the layer. You also have 2000 features per data element.

Therefore the array you are getting has 2000 rows, one for each input feature, and each row holds 32 weights, one for each hidden unit, hence the shape you get.

From the Keras docs we can see this example:

model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# now the model will take as input arrays of shape (*, 16)
# and output arrays of shape (*, 32)

# after the first layer, you don't need to specify
# the size of the input anymore:
model.add(Dense(32))

In your case the input dimension is 2000 and the layer has 32 units, so the kernel has shape (2000, 32), which is exactly what you are getting. Keras stores the kernel as (input_dim, units) because the forward pass computes dot(input, kernel) + bias. Either way, you can transpose the array to get the other view, as an (N, M) array holds the same elements as an (M, N) array.
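
A quick way to see the convention in action (a minimal sketch; the weights are untrained here, so only the shapes matter):

import numpy as np

# The kernel and bias of the first hidden layer.
W, b = dl_1.model.get_layer(name="dense_1").get_weights()
print(W.shape)  # (2000, 32): rows index input features, columns index units
print(b.shape)  # (32,): one bias per unit

# Keras stores the kernel this way because the forward pass computes
# output = activation(dot(input, kernel) + bias):
x = np.random.randn(5, 2000)    # a batch of 5 samples
out = np.maximum(0, x @ W + b)  # ReLU, matching the layer's activation
print(out.shape)                # (5, 32)

# To index by unit instead, just transpose or slice columns:
weights_of_unit_0 = W[:, 0]     # the 2000 incoming weights of unit 0
print(weights_of_unit_0.shape)  # (2000,)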