I'm trying to use a simple character-level Keras model to extract the key text from a sentence.

I feed it x_train, a padded sequence of shape (n_examples, 500) representing the entire sentence, and y_train, a padded sequence of shape (n_examples, 100) representing the important text to extract.
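For reference, the padded arrays are built along these lines (sentences_ids and summaries_ids are placeholder names for the integer-encoded character sequences, shown only to make the shapes concrete):

from keras.preprocessing.sequence import pad_sequences

# Hypothetical lists of lists of integer character ids, one inner list per example.
x_train = pad_sequences(sentences_ids, maxlen=500, padding='post')  # (n_examples, 500)
y_train = pad_sequences(summaries_ids, maxlen=100, padding='post')  # (n_examples, 100)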

I try a simple model like this:

from keras.models import Model
from keras.layers import Input, Embedding, LSTM, RepeatVector, TimeDistributed, Dense

vocab_size = 1000
src_txt_length = 500
sum_txt_length = 100
inputs = Input(shape=(src_txt_length,))

# Encoder: embed the source characters, compress the whole sequence into a
# single 128-dim vector, and repeat it once per output timestep.
encoder1 = Embedding(vocab_size, 128)(inputs)
encoder2 = LSTM(128)(encoder1)
encoder3 = RepeatVector(sum_txt_length)(encoder2)

# Decoder: emit a softmax vector at each of the 100 output timesteps.
decoder1 = LSTM(128, return_sequences=True)(encoder3)
outputs = TimeDistributed(Dense(100, activation='softmax'))(decoder1)

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam')

When I try to train it with the following code:

hist = model.fit(x_train, y_train, verbose=1, validation_data=(x_test, y_test), batch_size=batch_size, epochs=5)

I get the error:

ValueError: Error when checking target: expected time_distributed_27 to have 3 dimensions, but got array with shape (28500, 100)

My question is: I have the return_sequences parameter set to True on the last LSTM layer, yet the error seems to say that the data reaching the TimeDistributed layer is only 2-dimensional.

What am I doing wrong here? Any help would be greatly appreciated!

Comments:

Have you tried to just split TimeDistributed and Dense into two lines? – ixeption

Unfortunately yes. Even if I remove the TimeDistributed layer and run it, I get a similar error: ValueError: Error when checking target: expected dense_43 to have 3 dimensions, but got array with shape (28500, 100) – Steven

1 Answer

Keras isn't complaining about the input to TimeDistributed but about the target: y_train.shape == (n_examples, 100), which is only 2D, while the model's output is 3D (one softmax vector per timestep). In other words, you have a mismatch between predicting a sequence and supplying a single point per example: outputs is 3D but y_train is 2D.
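A minimal sketch of one way to fix this, assuming y_train holds integer character ids. Note that changing the output layer to Dense(vocab_size) and the loss to sparse_categorical_crossentropy are my suggestions, not part of the original code:

import numpy as np
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, RepeatVector, TimeDistributed, Dense

vocab_size = 1000
src_txt_length = 500
sum_txt_length = 100

inputs = Input(shape=(src_txt_length,))
encoder1 = Embedding(vocab_size, 128)(inputs)
encoder2 = LSTM(128)(encoder1)
encoder3 = RepeatVector(sum_txt_length)(encoder2)
decoder1 = LSTM(128, return_sequences=True)(encoder3)

# One softmax over the whole vocabulary per output timestep,
# so model.output_shape is (None, 100, vocab_size).
outputs = TimeDistributed(Dense(vocab_size, activation='softmax'))(decoder1)

model = Model(inputs=inputs, outputs=outputs)

# sparse_categorical_crossentropy takes integer targets directly, so y_train
# does not have to be one-hot encoded into (n_examples, 100, vocab_size).
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

# Add a trailing axis so the targets are 3D, matching the 3D model output:
# (n_examples, 100) -> (n_examples, 100, 1).
y_train_3d = np.expand_dims(y_train, axis=-1)
y_test_3d = np.expand_dims(y_test, axis=-1)

hist = model.fit(x_train, y_train_3d, validation_data=(x_test, y_test_3d), batch_size=32, epochs=5)

Alternatively, keep categorical_crossentropy and one-hot encode the targets with keras.utils.to_categorical(y_train, num_classes=vocab_size), which produces the 3D array of shape (n_examples, 100, vocab_size) directly; the sparse loss just avoids materialising it.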