0
votes

Hi, I am having trouble finding the correct input shape for my LSTM model. I have been trying to find a shape that fits but have trouble understanding what is required.

I think the problem is in the ytest and ytrain shapes. Why are they not the same shape as xtrain and xtest?

xtrain (80304, 37)
xtest (39538, 37)
ytrain (80304,)
ytest (39538,)
Epoch 1/3
2510/2510 [==============================] - 34s 13ms/step - loss: nan
Epoch 2/3
2510/2510 [==============================] - 32s 13ms/step - loss: nan
Epoch 3/3
2510/2510 [==============================] - 33s 13ms/step - loss: nan
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_10 (LSTM)               (None, 4)                 96        
_________________________________________________________________
dense_9 (Dense)              (None, 1)                 5         
=================================================================
Total params: 101
Trainable params: 101
Non-trainable params: 0

The model isn't training based on MSE: the loss stays nan, as shown above.

When I try to fit this model:

import numpy as np
import tensorflow as tf

# oral_ds is assumed to be a pandas DataFrame loaded earlier
print('tf version', tf.version.VERSION)

train_size = int(len(oral_ds) * 0.67)
print(train_size)
test_size = len(oral_ds) - train_size
print(test_size)
train = oral_ds[:train_size]
test = oral_ds[80319:119881]  # note: hard-coded indices rather than oral_ds[train_size:]

print(len(train), len(test))

X_train = train.drop(columns=['PRICE','WEEK_END_DATE','Optimized rev','Original rev'])
y_train = train.PRICE

X_test = test.drop(columns=['PRICE','WEEK_END_DATE','Optimized rev','Original rev'])
y_test = test.PRICE

print('xtrain',np.shape(X_train))
print('xtest',np.shape(X_test))
print('ytrain',np.shape(y_train))
print('ytest',np.shape(y_test))


# reshape X to 3-D (samples, timesteps, features) for the LSTM
X_train = X_train.values.reshape(X_train.shape[0], X_train.shape[1], 1)
#y_train = y_train.values.reshape(y_train.shape[0], y_train.shape[1], 1)
X_test = X_test.values.reshape(X_test.shape[0], X_test.shape[1], 1)
#y_test = y_test.values.reshape(y_test.shape[0], y_test.shape[1], 1)

#print('reshaped xtrain',np.shape(X_train))
#print('reshaped xtest',np.shape(X_test))
#print('reshaped ytrain',np.shape(y_train))
#print('reshaped ytest',np.shape(y_test))


single_step_model = tf.keras.models.Sequential()
single_step_model.add(tf.keras.layers.LSTM(4,
                                            input_shape=(37,1)))
single_step_model.add(tf.keras.layers.Dense(units = 1))
single_step_model.compile(optimizer = 'adam', loss = 'mean_squared_error')



BATCH_SIZE=32
train_data = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_data = train_data.cache().shuffle(10000).batch(BATCH_SIZE)
# note: valid_data is never batched or passed to fit() below
valid_data = tf.data.Dataset.from_tensor_slices((X_test, y_test))

history = single_step_model.fit(train_data, epochs=3)
single_step_model.summary()

I have tried to implement solutions from other posts such as:

But neither of these is working.

Any guidance would be appreciated.

Follow this link for your problem. Clearly explained here: stackoverflow.com/a/54416792/12598386 – Lakpa Tamang

Thank you Tamang, I have tried this and it doesn't seem to be working. I have updated my code to show the same error message. – AnnejetLouise

1 Answer

0
votes

So in general an LSTM expects 3-dimensional inputs:

 (batch_size, timesteps, number_of_features)

In Keras the batch dimension comes first and is left out of input_shape. I guess you have 37 timesteps and 1 feature, so just change your input to:

 (batch_size, 37, 1)

or, if the 37 columns are really 37 features of a single timestep, to (batch_size, 1, 37).

See the following dummy example:

import tensorflow as tf

inputs = tf.random.normal([100, 37, 1])   # (batch, timesteps, features)
lstm = tf.keras.layers.LSTM(units=50, input_shape=(37, 1))
output = lstm(inputs)
print(output.shape)
# (100, 50)
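
If the 37 columns are instead 37 features of a single timestep, a reshape along these lines gives the second layout. This is only a sketch, assuming X_train and X_test are still the pandas DataFrames from the question (before the existing reshape); the _1ts names are just illustrative:

 # Sketch: 1 timestep with 37 features per sample
 X_train_1ts = X_train.values.reshape(X_train.shape[0], 1, 37)
 X_test_1ts = X_test.values.reshape(X_test.shape[0], 1, 37)
 lstm_1ts = tf.keras.layers.LSTM(4, input_shape=(1, 37))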

The following code works end-to-end:

import numpy as np
import tensorflow as tf

# dummy data with the same shapes as in the question:
# X is (samples, 37 timesteps, 1 feature), y is one target value per sample
X_train = np.random.rand(80319, 37, 1)
y_train = np.random.randint(0, 2, 80319)

BATCH_SIZE = 32
train_data = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_data = train_data.cache().shuffle(10000).batch(BATCH_SIZE)

single_step_model = tf.keras.models.Sequential()
single_step_model.add(tf.keras.layers.LSTM(4, input_shape=(37, 1)))
single_step_model.add(tf.keras.layers.Dense(units=1))

single_step_model.compile(optimizer='adam', loss='mean_squared_error')
single_step_model.fit(train_data, epochs=10)

And if you remove the batching step, it throws exactly the same error.
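
A minimal sketch of that point, reusing the variable names from the question's code (this is only a sketch, not tested on the asker's data): the validation dataset also needs to be batched before it is passed to fit.

 # batch the validation dataset too; an un-batched tf.data.Dataset passed to
 # fit() yields each example without a batch dimension and raises a shape error
 valid_data = tf.data.Dataset.from_tensor_slices((X_test, y_test))
 valid_data = valid_data.batch(BATCH_SIZE)

 history = single_step_model.fit(train_data,
                                 validation_data=valid_data,
                                 epochs=3)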