I have an X_train set of 744983 samples divided into 24443 sequences, and the number of samples differs from sequence to sequence. Each sample is a 30-dimensional vector. How can I feed this data into a Keras LSTM? Here is a description of the training set:
print(type(X_train))
print(np.shape(X_train))
print(type(X_train[0]))
print(np.shape(X_train[0]))
<class 'list'>
(24443, )
<class 'numpy.ndarray'>
(46, 30)
When I set the parameters this way:
model = Sequential()
model.add(LSTM(4, input_shape = (30, ), return_sequences=True,))
model.add(Dense(1))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model.fit(X_train, y_train, epochs=1, batch_size=1, verbose=2)
The error is "Input 0 is incompatible with layer lstm_24: expected ndim=3, found ndim=2"
If I change input_shape from (30, ) to (None, 30), the code runs for about a minute and then gives the error 'Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 24443 arrays'
Furthermore, if I convert X_train to a NumPy array before fitting, the error becomes: expected lstm_26_input to have 3 dimensions, but got array with shape (24443, 1)
I also tried to pad the sequences :
X_train = sequence.pad_sequences(X_train)
X_test = sequence.pad_sequences(X_test)
However, it turned my inputs into 0, 1 and -1 everywhere:
#X_train = np.array(X_train)
#X_test = np.array(X_test)
print(X_train[0])
[[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
[ 0 0 0 ..., 0 0 0]
...,
[ 0 0 0 ..., 0 1 -1]
[ 0 0 0 ..., 0 1 0]
[ 0 0 0 ..., 0 0 0]]
X_train.shape
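For reference, here is a minimal sketch of the padding I am trying to achieve, written in plain NumPy so that the float values are kept. The helper name `pad_float_sequences` and the toy data are made up for illustration. Keras's `pad_sequences` takes a `dtype` argument that defaults to `'int32'`, which may be why my values were truncated to integers above; passing `dtype='float32'` there should be the equivalent of this sketch.

```python
import numpy as np

def pad_float_sequences(sequences, value=0.0):
    """Pre-pad variable-length (timesteps, features) arrays into one 3D float array."""
    max_len = max(len(seq) for seq in sequences)
    n_features = sequences[0].shape[1]
    # Start from an array filled with the padding value, shaped
    # (n_sequences, max_len, n_features), which is the 3D layout an LSTM expects.
    padded = np.full((len(sequences), max_len, n_features), value, dtype="float32")
    for i, seq in enumerate(sequences):
        # Right-align each sequence so the padding comes first ('pre' padding).
        padded[i, max_len - len(seq):] = seq
    return padded

# Toy stand-in for X_train: three sequences of different lengths, 30 features each.
rng = np.random.default_rng(0)
toy_sequences = [rng.normal(size=(n, 30)) for n in (46, 12, 5)]
X_padded = pad_float_sequences(toy_sequences)
print(X_padded.shape)  # (3, 46, 30)
```

An array of this shape is a single 3D input, so it would match input_shape=(None, 30) rather than being passed as a list of 24443 separate arrays.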
I find the question a bit confusing. If the problem is with input shapes, you should post the input shapes, not their content (from which we have no way of retrieving the shape). – gionni