Loading a trained crossValidation model in Spark

Question

I am new to Apache Spark. I trained a LogisticRegression model using cross-validation. For instance:

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new BinaryClassificationEvaluator)
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(5)
val cvModel = cv.fit(data)

I was able to train and test my model without any error. Then I saved the model and the pipeline using:

cvModel.save("/path-to-my-model/spark-log-reg-transfer-model")
pipeline.save("/path-to-my-pipeline/spark-log-reg-transfer-pipeline")

Up to this stage, everything worked perfectly. Later, when I tried to load my model back to predict on new data points, the following error occurred:

val sameModel = PipelineModel.load("/path-to-my-model/spark-log-reg-transfer-model")

java.lang.IllegalArgumentException: requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.PipelineModel but found class name org.apache.spark.ml.tuning.CrossValidatorModel

Any idea what I may have done wrong? Thanks.

kkurt kkurt · Accepted Answer · 2016-08-13T22:15:48

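You are trying to load a CrossValidatorModel with the PipelineModel loader. Each saved object has to be loaded with the class that saved it. cvModel is a CrossValidatorModel (the error message says exactly that), and pipeline, assuming it is the estimator you passed to setEstimator, is an unfitted Pipeline rather than a PipelineModel:

val sameCvModel = CrossValidatorModel.load("/path-to-my-model/spark-log-reg-transfer-model")

val samePipeline = Pipeline.load("/path-to-my-pipeline/spark-log-reg-transfer-pipeline")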
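If what you need for prediction is a PipelineModel, it can be taken from the loaded CrossValidatorModel through bestModel. The following is only a minimal round-trip sketch, assuming Spark 2.x or later running in local mode. The toy data, the single-stage pipeline, the parameter grid and the temporary directory are all made up for illustration and stand in for your own data and paths.

```scala
import java.nio.file.Files

import org.apache.spark.ml.{Pipeline, PipelineModel, PipelineStage}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, CrossValidatorModel, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

object LoadCvModelSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("load-cv-model-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy data: 100 rows, alternating labels, one feature that tracks the label.
    val data = (0 until 100).map { i =>
      val label = (i % 2).toDouble
      (label, Vectors.dense(label + 0.01 * (i % 7)))
    }.toDF("label", "features")

    // Assumed stand-ins for the pipeline and paramGrid from the question.
    val lr = new LogisticRegression().setMaxIter(10)
    val pipeline = new Pipeline().setStages(Array[PipelineStage](lr))
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .build()

    val cvModel = new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(new BinaryClassificationEvaluator)
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(3)
      .fit(data)

    // A fresh temporary location stands in for the path used in the question.
    val modelPath = Files.createTempDirectory("cv-model").resolve("model").toString
    cvModel.save(modelPath)

    // Load with the same class that wrote the metadata.
    val sameCvModel = CrossValidatorModel.load(modelPath)

    // The estimator was a Pipeline, so the best model is a fitted PipelineModel.
    val bestPipeline = sameCvModel.bestModel.asInstanceOf[PipelineModel]

    // Either object can be used for prediction.
    sameCvModel.transform(data).select("label", "prediction").show(5)
    bestPipeline.transform(data).select("label", "prediction").show(5)

    spark.stop()
  }
}
```

The cast to PipelineModel is only valid because the estimator given to the CrossValidator was a Pipeline. With a bare LogisticRegression as the estimator, bestModel would be a LogisticRegressionModel instead.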
