I'm having trouble computing precision, recall and F-measure as part of the metrics for evaluating an LSTM text classifier implemented in Keras on TensorFlow. I'm aware that these functions were removed from the Keras 2.02 metrics module.
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM
# create the model
embedding_vector_length = 32
model = Sequential()
# embed each of the top_words word indices as a 32-dimensional vector
model.add(Embedding(top_words, embedding_vector_length, input_length=max_tweet_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
print(scores)
# print the classification report
from sklearn.metrics import classification_report
predicted = model.predict(X_test)
report = classification_report(y_test, predicted)
print(report)
As an alternative, I'm passing the labels and the output predicted by the fitted model to sklearn.metrics.classification_report, but I keep getting errors about the data types of the targets. The predicted output is float32 because I'm using the sigmoid activation function, while the labels are binary class labels for a collection of texts. I get the accuracy from the Keras metrics, but precision, recall and F-measure are the problem.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/anaconda3/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py", line 1261, in precision_score
    sample_weight=sample_weight)
  File "/root/anaconda3/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py", line 1025, in precision_recall_fscore_support
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/root/anaconda3/envs/py35/lib/python3.5/site-packages/sklearn/metrics/classification.py", line 81, in _check_targets
    "and {1} targets".format(type_true, type_pred))
ValueError: Classification metrics can't handle a mix of binary and continuous targets
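The ValueError above suggests that sklearn is seeing 0/1 labels on one side and raw sigmoid probabilities on the other. One possible way around it is to threshold the probabilities into hard class labels before computing the metrics. The following is only a minimal sketch: it assumes the true labels are already 0/1 integers and that the model output has shape (n_samples, 1). The arrays and the to_class_labels helper are hypothetical stand-ins, not part of the original code, and the 0.5 cut-off is an assumption.

```python
import numpy as np
from sklearn.metrics import classification_report, precision_recall_fscore_support


def to_class_labels(probabilities, threshold=0.5):
    """Hypothetical helper: map sigmoid outputs to hard 0/1 labels."""
    # A sigmoid output layer yields shape (n_samples, 1); ravel() flattens it
    return (np.asarray(probabilities) > threshold).astype(int).ravel()


# Stand-in data (assumed, not from the question): true labels and sigmoid outputs
y_true = np.array([0, 1, 1, 0, 1, 0])
probabilities = np.array([[0.1], [0.8], [0.4], [0.6], [0.9], [0.2]], dtype=np.float32)

y_pred = to_class_labels(probabilities)

# Both arguments are now binary, so sklearn no longer sees a binary/continuous mix
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)
print(classification_report(y_true, y_pred))
```

With the model from the question, this would amount to something like classification_report(y_test, to_class_labels(model.predict(X_test))). If y_test holds text labels rather than 0/1 integers, they would need to be mapped to integers first.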
What's in X_train, X_test, y_train, and y_test? It seems you might have a bunch of 0s and 1s and potentially some extraneous elements, too. – blacksite