1
votes

I'm using Keras with the TensorFlow backend and am having trouble figuring out the right shapes for the layers of my model.

I have already read this useful explanation of the differences between various Keras layers' attributes.

This is the architecture of my model:

[image: model architecture diagram]

I'm trying to do binary classification (logistic regression) using categorical labels, so the last layer is a Dense layer with 1 unit, which I assumed would evaluate to 1 for the positive class and 0 for the negative class.

And this is my model's summary:

[image: model summary output]

The input on one side of the net has size 10158 and on the other side 20316. I have 1370 samples in total. My train_data's shape is (1370, 1, 10158), the labels' shape is (1, 1370), and the batch size is 100.

input_layer = Input(shape=(1,no_terms), name='docs')
s = Lambda(lambda x: x+1)(input_layer)
log_layer = Lambda(log, name='tf_output')(input_layer)

tpr_fpr = np.zeros((2, no_terms))
tpr_fpr[0,:] = np.sum(train_docs[np.where(train_label>0), :]>0, axis=1
                      )/np.sum(train_label>0) * (1000)
tpr_fpr[1,:] = np.sum(train_docs[np.where(train_label>0), :]>0, axis=1
                     )/np.sum(train_label <= 0) * (1000)

k_constants = backend.constant(np.reshape(tpr_fpr.T, (1,2*no_terms)))
fixed_input = Input(tensor=k_constants, shape=(1, 2*no_terms), name='tpr_fpr')
h = Dense(int(300), activation='relu', name='hidden', input_shape=(1, 2*no_terms), 
          trainable=True)(fixed_input)
h = Dropout(0.2, name="D")(h)
cd = Dense(units=no_terms, activation='relu', name='cd', trainable=True)(h)


prod = Multiply()([log_layer, cd])
o = Lambda(lambda x:(x/backend.sqrt(backend.sum(x * x,axis=1,keepdims=True))))(prod)
o = ReLU(max_value=None, negative_slope=0.0, threshold=0.0)(o)
o = Dense(1, activation='sigmoid', input_shape=(no_terms,))(o)


model_const = Model(fixed_input,cd)
model = Model([input_layer, fixed_input], o)

op = optimizers.RMSprop(learning_rate=.1, rho=0.9)
model.compile(optimizer=op, loss=mean_squared_error, metrics=['accuracy'])
plot_model(model, to_file='model.png')
model.summary()
batchSize = 100

checkpoint = ModelCheckpoint(filepath="a.hdf5",monitor='val_acc', mode='max', 
                             save_best_only=True)
earlystop=EarlyStopping(monitor='val_loss', patience=20)

train_docs.shape = (train_docs.shape[0], 1, train_docs.shape[1])
train_label = to_categorical(train_label, num_classes=2, dtype='float32')
model.fit(train_docs, train_label, epochs=10, batch_size=batchSize, 
          validation_data=(test_docs, test_label), 
          callbacks=[earlystop, checkpoint], verbose=1)

And here's the error I got: "ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (1430, 2)"

I have no idea what the (1430, 2) shape refers to or why I got this error.

3
Presumably something is wrong with your lambda layer. - The Guy with The Hat
"but got array with shape ..." refers to the data that's being fed in, not the shape of the model; seems like an error with your data to me. - Mohamad Zeina
To better assist, the code of your model and data would help more than the images (which you could remove entirely). - OverLordGoldDragon
@OverLordGoldDragon I've added the previous code (the one that had the problem) and the solution code, which at this point I'm not sure is a true fix or just avoids the error. I would be thankful if you could look at both and explain to me why the previous structure didn't work. Thanks. - Ehsan
@TheGuywithTheHat The lambda layer was fine. The problem was with the shapes. - Ehsan

3 Answers

1
votes

You did indeed directly solve the problem - here's how:

  • Keras binary classification with a single output unit expects labels ('targets') shaped as (batch_size, 1). Reason: the final layer outputs predictions, which are compared against the labels to compute metrics (loss, accuracy, etc.), so the labels must have the same shape as the output.
  • This is also why to_categorical was a problem (see the snippet from the docs below): it turned your labels into shape (n_samples, 2). For binary classification, one-hot encoding is redundant, as binary_crossentropy directly compares labels against predictions as supplied.
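  • Keras Dense is applied along the last axis, so a 2D input (batch_size, input_dim) gives a 2D output (batch_size, units). Your reshaping made the input 3D, (batch_size, 1, input_dim), so the output of the final Dense(1) was 3D as well, (batch_size, 1, 1). That is why the error says dense_1 was expected to have 3 dimensions.
  • This is also why changing shape=(1, no_terms) to shape=(no_terms,) helped. Each declaration matched the data you were feeding at the time; what changed is the output shape it leads to. The shape argument excludes the batch dimension, so the full batch shape is (batch_size, no_terms), with no_terms == input_dim.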
  • Lastly, for binary classification, use loss='binary_crossentropy', and avoid mean squared error for classification problems unless you have a very specific reason.
# Consider an array of 5 labels out of a set of 3 classes {0, 1, 2}:
> labels
array([0, 2, 1, 2, 0])
# `to_categorical` converts this into a matrix with as many
# columns as there are classes. The number of rows
# stays the same.
> to_categorical(labels)
array([[ 1.,  0.,  0.],
       [ 0.,  0.,  1.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.],
       [ 1.,  0.,  0.]], dtype=float32)
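
To illustrate the last bullet about the loss, here is a minimal sketch in plain Python (not Keras's own implementation) that compares the two per-sample losses for a single sigmoid output. The function names and the eps clipping value are assumptions made for this example.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample binary cross-entropy for one sigmoid output (illustrative)."""
    # Clip the prediction so that log(0) can never occur; eps is an assumed value.
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def squared_error(y_true, y_pred):
    """Per-sample squared error, the quantity that MSE averages."""
    return (y_true - y_pred) ** 2


# The true label is 1. Compare a mildly wrong and a confidently wrong prediction.
for p in (0.4, 0.01):
    print(f"pred={p}: squared_error={squared_error(1, p):.4f}, "
          f"binary_crossentropy={binary_crossentropy(1, p):.4f}")
```

For a prediction in [0, 1], squared error can never exceed 1, while cross-entropy keeps growing as the prediction becomes more confidently wrong, which gives the optimizer a much stronger signal for misclassified samples.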
0
votes

Well, I found a solution to the error, but I still can't understand why the previous shape didn't work. Quite frankly, this fix was the one success among numerous rounds of trial and error.

I changed the input shapes of the following layers from the format (1, x) to the format (x,):

    input_layer = Input(shape=(no_terms,), name='docs')

    k_constants = backend.constant(np.reshape(tpr_fpr.T, (1,2*no_terms)))
    fixed_input = Input(tensor=k_constants, shape=(2*no_terms,), name='tpr_fpr')
    h = Dense(int(300), activation='relu', name='hidden', input_shape=(2*no_terms,), trainable=True)(fixed_input)

    o = ReLU(max_value=None, negative_slope=0.0, threshold=0.0)(o)
    o = Dense(1, activation='sigmoid', input_shape=(no_terms,))(o)

and also removed the following lines from the code:

    train_docs.shape = (train_docs.shape[0], 1, train_docs.shape[1])
    train_label = to_categorical(train_label, num_classes=2, dtype='float32')

Now I'm just using a label array of shape (#no_of_samples, 1), which is binary rather than categorical.

So the new structure is: [image: updated model architecture diagram]

I hope somebody can explain what was wrong with the previous model, so I would avoid making the same mistake again.

Thanks.

0
votes

"Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (1430, 2)"

That means dense_1 has 3 dimensions, but the array you passed has only 2. If you are designing this model for image processing, you have to declare the image shape, that is (48, 48, 1) or (48, 48, 3), where 1 is for grayscale and 3 is for RGB images.
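
The mismatch can be traced with plain shape arithmetic. The sketch below is not Keras code. It assumes only that Dense acts on the last axis and that the element-wise layers between the input and the final Dense keep the shape unchanged. The helper names are hypothetical, and 1430 is the sample count taken from the error message.

```python
BATCH = None  # placeholder for the batch dimension, as shown by model.summary()


def dense_output_shape(input_shape, units):
    """Dense acts on the last axis only: (..., input_dim) -> (..., units)."""
    return tuple(input_shape[:-1]) + (units,)


def describe_mismatch(output_shape, target_shape):
    """Return None if the target fits the output, else a short description."""
    if len(output_shape) != len(target_shape):
        return (f"expected {len(output_shape)} dimensions, "
                f"but got array with shape {target_shape}")
    # Skip the batch axis and compare the remaining axes one by one.
    for expected, actual in zip(output_shape[1:], target_shape[1:]):
        if expected != actual:
            return (f"expected shape {output_shape}, "
                    f"but got array with shape {target_shape}")
    return None


no_terms = 10158  # input size from the question

# Original model: Input(shape=(1, no_terms)) ... Dense(1)
old_output = dense_output_shape((BATCH, 1, no_terms), units=1)
# Fixed model: Input(shape=(no_terms,)) ... Dense(1)
new_output = dense_output_shape((BATCH, no_terms), units=1)

one_hot_target = (1430, 2)  # labels after to_categorical(..., num_classes=2)
binary_target = (1430, 1)   # one 0/1 label per sample

print(old_output, describe_mismatch(old_output, one_hot_target))
print(new_output, describe_mismatch(new_output, binary_target))
```

With the original input shape the output has three axes, so a two-axis label array cannot match it. With the fixed input shape and one 0/1 label per sample, the output and the target line up.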