Illegal Argument Exception using Random Forest in PySpark mllib

Question

I am using the Random Forest algorithm for classification in Spark MLlib with PySpark. My code is as follows:

from pyspark.mllib.tree import RandomForest

model = RandomForest.trainClassifier(trnData, numClasses=3, categoricalFeaturesInfo={}, numTrees=3, featureSubsetStrategy="auto", impurity='gini', maxDepth=4, maxBins=32)

predictions = model.predict(tst_dataRDD.map(lambda x: x.features))

labelsAndPredictions = tst_dataRDD.map(lambda lp: lp.label).zip(predictions)

testErr = labelsAndPredictions.filter(lambda x: x[0] != x[1]).count() / float(tst_dataRDD.count())

I got IllegalArgumentException: GiniAggregator given label -0.0625 but requires label to be non-negative.
How can I solve this problem? Thanks

Hossein Torabi · Accepted Answer · 2020-06-28T20:32:50

Please use RandomForestClassifier (from the DataFrame-based pyspark.ml package) instead, and see the docs: https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier
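It may also be worth checking the labels themselves. The exception says the label must be non-negative, and for classification the labels are expected to be class indices 0, 1, ..., numClasses-1, so a value like -0.0625 suggests the label field is holding something else (for example a scaled or feature column). Below is a minimal sketch in plain Python for spotting such values. The helper name find_invalid_labels and the sample list are hypothetical, not part of Spark; in practice the list could be obtained with something like trnData.map(lambda lp: lp.label).distinct().collect().

```python
def find_invalid_labels(labels, num_classes):
    """Return the sorted distinct labels that are not class indices 0..num_classes-1.

    Hypothetical helper for illustration; it is not part of the Spark API.
    """
    invalid = set()
    for label in labels:
        value = float(label)
        # A valid label is a whole number in the range [0, num_classes).
        is_index = value.is_integer() and 0 <= value < num_classes
        if not is_index:
            invalid.add(value)
    return sorted(invalid)


# Stand-in for the distinct labels collected from the training RDD (assumed data).
sample_labels = [0.0, 1.0, 2.0, -0.0625, 1.0]
print(find_invalid_labels(sample_labels, num_classes=3))
```

If this returns anything other than an empty list, the labels need to be fixed or re-mapped to class indices before training, whichever Random Forest API is used.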

