
I am using Keras 2.0.2 to create an LSTM network for a classification task. The network topology is as follows:

from numpy.random import seed
seed(42)
from tensorflow import set_random_seed
set_random_seed(42)
import os
#os.environ['PYTHONHASHSEED'] = '0'

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(embedding_layer)  # embedding_layer is defined elsewhere
model.add(LSTM(units=100))  # line A
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

On the same dataset, with line A as written above, I obtain the following results:

         precision  recall  f1-score  support
    0         0.68    0.58      0.63      305
    2         0.92    0.95      0.93     1520
    avg       0.80    0.76      0.78     1825

where 0 and 2 are the class labels.

But when line A is changed to

model.add(LSTM(100))

I obtain different results:

         precision  recall  f1-score  support
    0         0.66    0.58      0.62      305
    2         0.92    0.94      0.93     1520
    avg       0.79    0.76      0.77     1825

This does not make sense to me: according to the Keras documentation (https://keras.io/layers/recurrent/#lstm), I thought the two lines should be identical. Have I misunderstood something?


1 Answer


model.add(LSTM(100)) and model.add(LSTM(units=100)) are equivalent: units is simply the first positional argument of LSTM, so both calls configure the layer identically. The difference between your results is caused by randomness in the training process (e.g. weight initialization and dropout). To avoid this randomness and get reproducible results, you should set the random seeds at the very top of your script.
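That a positional and a keyword call bind to the same parameter can be checked with Python's inspect module. This is only an illustration: the lstm function below is a hypothetical stand-in whose first parameter is named units, mirroring LSTM.__init__, not the real Keras layer.

```python
import inspect

# Hypothetical stand-in: first parameter is named 'units', like LSTM.__init__.
def lstm(units, activation='tanh'):
    return {'units': units, 'activation': activation}

sig = inspect.signature(lstm)
positional = sig.bind(100)     # analogous to LSTM(100)
keyword = sig.bind(units=100)  # analogous to LSTM(units=100)

# Both calls bind 100 to the same parameter, so the resulting
# configuration is identical either way.
assert positional.arguments == keyword.arguments == {'units': 100}
```

Since the two calls produce the same bound arguments, any difference between runs must come from somewhere other than the layer definition.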

For the Theano backend, add

from numpy.random import seed
seed(1)

to the top of your code.

For the TensorFlow backend, add

from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)

to the top of your code.
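The effect of seeding can be demonstrated with the standard-library random module alone (the same principle applies to the NumPy and TensorFlow seeds above): re-seeding with the same value before each run reproduces the random draws exactly, while a different seed generally does not.

```python
import random

def draws(seed_value, n=5):
    # Re-seed before generating, as you would at the top of a training script.
    random.seed(seed_value)
    return [random.random() for _ in range(n)]

run1 = draws(1)
run2 = draws(1)
assert run1 == run2   # same seed -> identical sequence

run3 = draws(2)
assert run1 != run3   # different seed -> different sequence
```

This is why two otherwise identical training scripts can report slightly different precision and recall: without a fixed seed, each run draws different random numbers for initialization and dropout.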

The above code is taken from this blog post.