Is it possible to compute evaluation metrics (precision and recall) for a multiclass classification problem in Apache Spark? I am using logistic regression from Spark's MLlib to build my model and want to evaluate it with these metrics.
1 Answer
From the MLlib docs:
Assuming your test data is an RDD of LabeledPoint named test and your trained model is named model:
    import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
    import org.apache.spark.mllib.evaluation.MulticlassMetrics
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.util.MLUtils

    val predictionAndLabels = test.map { case LabeledPoint(label, features) =>
      val prediction = model.predict(features)
      (prediction, label)
    }

    val metrics = new MulticlassMetrics(predictionAndLabels)
Confusion matrix
    println("Confusion matrix:")
    println(metrics.confusionMatrix)
Overall Statistics
    val accuracy = metrics.accuracy
    println("Summary Statistics")
    println(s"Accuracy = $accuracy")
Precision by label
    val labels = metrics.labels
    labels.foreach { l =>
      println(s"Precision($l) = " + metrics.precision(l))
    }
Recall by label
    labels.foreach { l =>
      println(s"Recall($l) = " + metrics.recall(l))
    }
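For intuition about what the per-label numbers mean, here is a minimal plain-Scala sketch (no Spark required) that computes precision and recall for a single label from an in-memory list of (prediction, label) pairs. The object name PerLabelSketch and the sample data are made up for illustration, and returning 0.0 when a denominator is zero is a choice made for this sketch.

```scala
object PerLabelSketch {
  // Each pair is (prediction, label), the same order MulticlassMetrics expects.
  type Pair = (Double, Double)

  // Precision for label l: of everything predicted as l, the fraction that really is l.
  def precision(pairs: Seq[Pair], l: Double): Double = {
    val predictedAsL = pairs.count { case (p, _) => p == l }
    val truePositives = pairs.count { case (p, y) => p == l && y == l }
    if (predictedAsL == 0) 0.0 else truePositives.toDouble / predictedAsL
  }

  // Recall for label l: of everything that really is l, the fraction predicted as l.
  def recall(pairs: Seq[Pair], l: Double): Double = {
    val actuallyL = pairs.count { case (_, y) => y == l }
    val truePositives = pairs.count { case (p, y) => p == l && y == l }
    if (actuallyL == 0) 0.0 else truePositives.toDouble / actuallyL
  }

  // Hypothetical sample: three classes (0.0, 1.0, 2.0), six test points.
  val sample: Seq[Pair] = Seq(
    (0.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (1.0, 1.0), (2.0, 2.0), (0.0, 2.0)
  )
}
```

For label 0.0 in this sample, three points are predicted as 0.0 and only one of them is correct, so precision is 1/3, while the single true 0.0 point is found, so recall is 1.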
False positive rate by label
    labels.foreach { l =>
      println(s"FPR($l) = " + metrics.falsePositiveRate(l))
    }
F-measure by label
    labels.foreach { l =>
      println(s"F1-Score($l) = " + metrics.fMeasure(l))
    }
Weighted stats
    println(s"Weighted precision: ${metrics.weightedPrecision}")
    println(s"Weighted recall: ${metrics.weightedRecall}")
    println(s"Weighted F1 score: ${metrics.weightedFMeasure}")
    println(s"Weighted false positive rate: ${metrics.weightedFalsePositiveRate}")
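The weighted statistics average the per-label values, weighting each label by how often it occurs among the true labels. A minimal plain-Scala sketch of that idea for precision follows. The object name WeightedSketch and the sample data are hypothetical; this is an illustration of the weighting under those assumptions, not Spark's actual implementation.

```scala
object WeightedSketch {
  type Pair = (Double, Double) // (prediction, label)

  // Per-label precision: true positives over everything predicted as l.
  def precision(pairs: Seq[Pair], l: Double): Double = {
    val predictedAsL = pairs.count { case (p, _) => p == l }
    val truePositives = pairs.count { case (p, y) => p == l && y == l }
    if (predictedAsL == 0) 0.0 else truePositives.toDouble / predictedAsL
  }

  // Weighted precision: sum over labels of precision(l) * (share of true labels equal to l).
  def weightedPrecision(pairs: Seq[Pair]): Double = {
    val total = pairs.size.toDouble
    val labelCounts = pairs.groupBy(_._2).map { case (l, ps) => l -> ps.size }
    labelCounts.map { case (l, n) => precision(pairs, l) * n / total }.sum
  }

  // Hypothetical sample: three classes, six test points.
  val sample: Seq[Pair] = Seq(
    (0.0, 0.0), (0.0, 1.0), (1.0, 1.0),
    (1.0, 1.0), (2.0, 2.0), (0.0, 2.0)
  )
}
```

On this sample the per-label precisions are 1/3, 1 and 1, and the label shares are 1/6, 3/6 and 2/6, so the weighted precision is 16/18 (about 0.889).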