
I'm trying to better understand the Keras LSTM layer with regard to timesteps, but am still struggling a bit.

I want to create a model that can compare 2 inputs (a siamese network), so my input is a preprocessed text, twice. The preprocessing is done as follows:

max_len = 64
data['cleaned_text_1'] = data.apply(lambda x: clean_string(x['text_1']), axis=1)  # 'text_1' = raw text column
data['text_1_seq'] = t.texts_to_sequences(data['cleaned_text_1'].astype(str).values)
data['text_1_seq_pad'] = [list(x) for x in pad_sequences(data['text_1_seq'], maxlen=max_len, padding='post')]

The same is done for the second text input; t is a fitted keras.preprocessing.text.Tokenizer.
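For context, here is a minimal, self-contained sketch of that preprocessing (the example sentences are placeholders; the real data comes from a DataFrame column):

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_len = 64
texts = ["a small example sentence", "another example"]  # placeholder inputs

t = Tokenizer()
t.fit_on_texts(texts)               # build the word index once, over all texts

seqs = t.texts_to_sequences(texts)  # lists of word indices, ragged lengths
padded = pad_sequences(seqs, maxlen=max_len, padding='post')

print(padded.shape)                 # (2, 64): (samples, time_steps)
```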

I defined the model with:

common_embed = Embedding(
    name="synopsis_embedd",
    input_dim=len(t.word_index)+1,
    output_dim=300,
    input_length=len(data['text_1_seq_pad'].tolist()[0]),
    trainable=True
)

lstm_layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(32, dropout=0.2, recurrent_dropout=0.2)
)

input1 = tf.keras.Input(shape=(len(data['text_1_seq_pad'].tolist()[0]),))
e1 = common_embed(input1)
x1 = lstm_layer(e1)

input2 = tf.keras.Input(shape=(len(data['text_1_seq_pad'].tolist()[0]),))
e2 = common_embed(input2)
x2 = lstm_layer(e2)

merged = tf.keras.layers.Lambda(
    function=l1_distance, output_shape=l1_dist_output_shape, name='L1_distance'
)([x1, x2])

conc = Concatenate(axis=-1)([merged, x1, x2])

x = Dropout(0.01)(conc)
preds = tf.keras.layers.Dense(1, activation='sigmoid')(x)
model = tf.keras.Model(inputs=[input1, input2], outputs=preds)
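The model above references l1_distance and l1_dist_output_shape without showing them; a common definition for a siamese L1 merge looks like this (an assumption on my part, not necessarily the exact helpers used):

```python
import tensorflow as tf

def l1_distance(vectors):
    # Element-wise absolute difference between the two branch outputs
    x, y = vectors
    return tf.abs(x - y)

def l1_dist_output_shape(shapes):
    # Output shape matches a single branch: (batch, units)
    shape1, _ = shapes
    return shape1

a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0, 0.5]])
print(l1_distance([a, b]).numpy())  # [[2.  1.5]]
```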

That seems to work when I feed the NumPy data to the fit method:

model.fit(
    x = [np.array(data['text_1_seq_pad'].tolist()), np.array(data['text_2_seq_pad'].tolist())],
    y = y_train.values.reshape(-1,1), 
    epochs=epochs,
    batch_size=batch_size,
    validation_data=([np.array(val['text_1_seq_pad'].tolist()), np.array(val['text_2_seq_pad'].tolist())], y_val.values.reshape(-1,1)),
)

What I'm trying to understand at the moment is what the shapes are in my case for the LSTM layer:

  • samples
  • time_steps
  • features

Is it correct that the input_shape for the LSTM layer would be input_shape=(300, 1), because I set the embedding output_dim to 300 and have only 1 input feature per LSTM?

And do I need to reshape the embedding output, or can I just set

lstm_layer = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(32, input_shape=(300,1), dropout=0.2, recurrent_dropout=0.2)
)

directly on the embedding output?
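For reference, the shapes in question can be inspected with a small sketch (the vocabulary size here is a placeholder): an Embedding layer fed sequences of length 64 with output_dim=300 emits a 3-D tensor (batch, time_steps=64, features=300), which is exactly what the LSTM expects, so no reshape should be needed.

```python
import tensorflow as tf

vocab_size, max_len, emb_dim = 1000, 64, 300   # placeholder sizes

inp = tf.keras.Input(shape=(max_len,))
emb = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=emb_dim)(inp)
out = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32))(emb)

# Embedding output is (batch, time_steps, features) = (None, 64, 300),
# so the LSTM already receives 3-D input directly.
print(emb.shape)   # (None, 64, 300)
print(out.shape)   # (None, 64) -- 32 units x 2 directions
```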

Can you provide a reproducible example? - AloneTogether