I am using the Random Forest algorithm for classification in Spark MLlib with PySpark. My code is as follows:
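from pyspark.mllib.tree import RandomForest
# Train a 3-class random forest on the training RDD of LabeledPoint.
model = RandomForest.trainClassifier(trnData, numClasses=3, categoricalFeaturesInfo={}, numTrees=3, featureSubsetStrategy="auto", impurity='gini', maxDepth=4, maxBins=32)
# Predict on the test features, then pair each true label with its prediction.
predictions = model.predict(tst_dataRDD.map(lambda x: x.features))
labelsAndPredictions = tst_dataRDD.map(lambda lp: lp.label).zip(predictions)
# Fraction of test points whose predicted label differs from the true label.
testErr = labelsAndPredictions.filter(lambda x: x[0] != x[1]).count() / float(tst_dataRDD.count())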
I got an IllegalArgumentException: GiniAggregator given label -0.0625but requires label to be non-negative.
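The message suggests that a label of -0.0625 reached the trainer, while the MLlib documentation for trainClassifier says labels should take values {0, 1, ..., numClasses-1}. One check that may help narrow this down is to list the distinct labels that fall outside that set. The following is only a minimal sketch: the helper name `invalid_labels` and the sample values are hypothetical, not part of MLlib or of my data.

```python
def invalid_labels(labels, num_classes):
    """Return the sorted distinct labels that are not whole numbers in [0, num_classes)."""
    bad = set()
    for label in labels:
        value = float(label)
        # A valid class label is a non-negative whole number below num_classes.
        if value < 0 or value >= num_classes or not value.is_integer():
            bad.add(value)
    return sorted(bad)


# Hypothetical sample standing in for the distinct training labels.
sample = [0.0, 1.0, 2.0, -0.0625, 1.0]
print(invalid_labels(sample, 3))
```

On the real data, and assuming trnData is an RDD of LabeledPoint, the labels could be gathered with trnData.map(lambda lp: lp.label).distinct().collect() and passed to the helper with num_classes=3. A non-empty result would point to how the LabeledPoint objects were built (for example, a feature column used as the label), though that is a guess rather than a confirmed cause.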
How can I solve this problem? Thanks