2
votes

I am new to Apache Spark. I trained a LogisticRegression model using cross-validation. For instance:

val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(new BinaryClassificationEvaluator)
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(5)

val cvModel = cv.fit(data)

I was able to train and test my model without any error. Then I saved the model and the pipeline using:

cvModel.save("/path-to-my-model/spark-log-reg-transfer-model")
pipeline.save("/path-to-my-pipeline/spark-log-reg-transfer-pipeline")

Up to this stage, everything worked perfectly. Later, when I tried to load the model back to make predictions on new data points, the following error occurred:

val sameModel = PipelineModel.load("/path-to-my-model/spark-log-reg-transfer-model")

java.lang.IllegalArgumentException: requirement failed: Error loading metadata: Expected class name org.apache.spark.ml.PipelineModel but found class name org.apache.spark.ml.tuning.CrossValidatorModel

Any idea what I may have done wrong? Thanks.


2 Answers

4
votes

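You are trying to load a CrossValidatorModel with the PipelineModel loader. Each saved object has to be loaded with the class that wrote it: cvModel is a CrossValidatorModel, and pipeline (the estimator you passed to setEstimator) is an unfitted Pipeline, not a PipelineModel.

val sameCvModel = CrossValidatorModel.load("/path-to-my-model/spark-log-reg-transfer-model")

val samePipeline = Pipeline.load("/path-to-my-pipeline/spark-log-reg-transfer-pipeline")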
1
votes

To load a CrossValidator (the unfitted estimator), the call would be

val crossValidator = CrossValidator.load("/path-to-my-model/spark-log-reg-transfer-model")

To load a CrossValidatorModel (a CrossValidator becomes a CrossValidatorModel when you call fit() on it), use

val crossValidatorModel = CrossValidatorModel.load("/path-to-my-model/spark-log-reg-transfer-model")

Since you are trying to load a model, CrossValidatorModel.load would be the correct one.
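
If a PipelineModel is what you want in the end, it can probably be pulled out of the loaded CrossValidatorModel. The following is a minimal sketch, not a definitive implementation. It assumes the estimator passed to setEstimator was a Pipeline (so bestModel can be cast to PipelineModel) and that a SparkSession is already active when the loaders run. The object name LoadSavedModels and its helper methods are hypothetical; the paths are the ones from the question.

```scala
import org.apache.spark.ml.{Pipeline, PipelineModel}
import org.apache.spark.ml.tuning.CrossValidatorModel
import org.apache.spark.sql.DataFrame

// Hypothetical helper object; names are for illustration only.
object LoadSavedModels {
  // Paths taken from the question.
  val modelPath = "/path-to-my-model/spark-log-reg-transfer-model"
  val pipelinePath = "/path-to-my-pipeline/spark-log-reg-transfer-pipeline"

  // cvModel.save(...) wrote a CrossValidatorModel, so use the matching loader.
  def loadCvModel(): CrossValidatorModel =
    CrossValidatorModel.load(modelPath)

  // pipeline.save(...) wrote the unfitted Pipeline (an estimator).
  def loadPipeline(): Pipeline =
    Pipeline.load(pipelinePath)

  // Assumption: the tuned estimator was a Pipeline, so the best model
  // found by cross-validation is a PipelineModel.
  def bestPipelineModel(cvModel: CrossValidatorModel): PipelineModel =
    cvModel.bestModel.asInstanceOf[PipelineModel]

  // Predict on new data; CrossValidatorModel.transform delegates to bestModel.
  def predict(newData: DataFrame): DataFrame =
    loadCvModel().transform(newData)
}
```

For prediction alone, calling transform on the loaded CrossValidatorModel is enough. The cast is only needed if you want to inspect the individual pipeline stages, for example the fitted LogisticRegressionModel.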