I am trying to train a logistic regression model (LogisticRegressionWithSGD), but it fails with the error:
org.apache.spark.SparkException: Input validation failed.
If I give it binary labels (0 and 1 instead of 0, 1 and 2), training succeeds.
Example input:
parsed_data = [LabeledPoint(0.0, [4.6, 3.6, 1.0, 0.2]),
               LabeledPoint(0.0, [5.7, 4.4, 1.5, 0.4]),
               LabeledPoint(1.0, [6.7, 3.1, 4.4, 1.4]),
               LabeledPoint(0.0, [4.8, 3.4, 1.6, 0.2]),
               LabeledPoint(2.0, [4.4, 3.2, 1.3, 0.2])]
Code:
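# train() expects an RDD of LabeledPoint; sc is assumed to be the existing SparkContext
model = LogisticRegressionWithSGD.train(sc.parallelize(parsed_data))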
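To narrow this down, here is a quick label check in plain Python. It is only a sketch: it assumes the validation rejects any label other than 0 and 1 (my guess, based on the binary case succeeding), and `rows` and `non_binary_labels` are hypothetical names standing in for the LabeledPoint data above.

```python
# Hypothetical stand-in for the LabeledPoint rows above: (label, features) tuples.
rows = [
    (0.0, [4.6, 3.6, 1.0, 0.2]),
    (0.0, [5.7, 4.4, 1.5, 0.4]),
    (1.0, [6.7, 3.1, 4.4, 1.4]),
    (0.0, [4.8, 3.4, 1.6, 0.2]),
    (2.0, [4.4, 3.2, 1.3, 0.2]),
]


def non_binary_labels(rows):
    """Return the sorted distinct labels that are neither 0.0 nor 1.0."""
    return sorted({label for label, _ in rows if label not in (0.0, 1.0)})


# Assumption: only labels 0 and 1 pass validation, so anything listed here
# would be a candidate cause of the failure.
offending = non_binary_labels(rows)
print(offending)
```

With the sample above, the only label flagged is the 2.0 on the last row, which matches the observation that the 0/1-only input trains fine.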
Is the logistic regression model in Spark meant for binary classification only?