
I was trying to use a Bidirectional LSTM to classify text data (sentences) into certain classes; I used 3 classes as an example. I followed the multilabel-classification post, i.e. "use sigmoid for the activation of your output layer" and "use binary_crossentropy for the loss function". I used an embedding layer (word vectors of size 300), and my sentences are padded and truncated so that each sentence has 100 tokens. Here is the code for my model:

from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential()

embedding_layer = Embedding(6695,
                            300,
                            weights=[embedding_matrix],
                            input_length=100,
                            trainable=True)

model.add(embedding_layer)
model.add(Bidirectional(LSTM(32, return_sequences=False)))
model.add(Dense(3, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])

print("model fitting - Bidirectional LSTM")
model.summary()

history = model.fit(X_train, y_train,
                    batch_size=256,
                    epochs=6,
                    validation_data=(X_val, y_val),
                    shuffle=True,
                    verbose=1)

Here is the model summary, which is what I expected (screenshot omitted).

However, I got this error:

Traceback (most recent call last):
  File "/Users/master/Documents/Deep Learning/Learning Keras/reveiw_classification.py", line 159, in <module>
    verbose = 1
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training.py", line 955, in fit
    batch_size=batch_size)
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training.py", line 792, in _standardize_user_data
    exception_prefix='target')
  File "/Users/master/.pyenv/versions/ENV4/lib/python3.6/site-packages/keras/engine/training_utils.py", line 136, in standardize_input_data
    str(data_shape))
ValueError: Error when checking target: expected dense_1 to have shape (3,) but got array with shape (100,)

I do not need the LSTM to return a sequence of hidden-state outputs; I just need the last output. I thought that because I used return_sequences=False in the LSTM, the Bidirectional LSTM with 32 units would have output dimension (None, 64), as shown in the model summary. So why does it say "expected dense_1 to have shape (3,) but got array with shape (100,)"? Could someone help me here?


1 Answer


It looks like your target y_train actually contains the tokenized sentences, not vectors of labels (e.g. [1, 0, 1]). The error is not about the model but about the data you pass to it.

  • Your y_train should be a 2D array of shape (num_samples, 3), so that for each sample (sentence) you have a target vector of 3 labels.
  • X_train in this case would have shape (num_samples, 100), i.e. sentences of length 100, as your input.
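To illustrate the shape the Dense(3, activation='sigmoid') layer expects, here is a minimal sketch (raw_labels is a hypothetical example, not data from the question) that converts per-sentence label lists into a multi-hot target array of shape (num_samples, 3):

```python
import numpy as np

# Hypothetical raw labels: each sentence may belong to any subset of 3 classes.
raw_labels = [[0], [1, 2], [0, 2], [1]]

num_classes = 3
y_train = np.zeros((len(raw_labels), num_classes), dtype="float32")
for i, labels in enumerate(raw_labels):
    y_train[i, labels] = 1.0  # set a 1 for every class the sentence belongs to

print(y_train.shape)  # (4, 3) -- matches the Dense(3) output layer
```

Passing an array shaped like this as y_train (instead of the padded sentences themselves) resolves the "expected dense_1 to have shape (3,) but got array with shape (100,)" error.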