from pyspark.ml.regression import RandomForestRegressor
rf = RandomForestRegressor(labelCol="label", featuresCol="features", numTrees=5, maxDepth=10, seed=42)
rf_model = rf.fit(train_df)
rf_model_path = "./hdfsData/" + "rfr_model"
rf_model.save(rf_model_path)
The first time I saved the model, these lines worked. But when I tried to save the model to the same path again, I got this error:
Py4JJavaError: An error occurred while calling o1695.save. : java.io.IOException: Path ./hdfsData/rfr_model already exists. Please use write.overwrite().save(path) to overwrite it.
Then I tried:
rf_model.write.overwrite().save(rf_model_path)
It gave:
AttributeError: 'function' object has no attribute 'overwrite'
It seems that the pyspark.mllib module provides the overwrite function but the pyspark.ml module does not. Does anyone know how to resolve this so that I can overwrite the old model with the new one? Thanks.
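
A possible explanation, sketched with stand-in classes rather than PySpark itself: in the Python API `write` is a method, so it has to be called as `write()` before chaining `overwrite()`, whereas the hint in the Java error message uses Scala syntax, where the parentheses are optional. `FakeModel`, `FakeWriter`, and the `store` dict below are hypothetical names that only mimic the shape of the API; nothing here touches Spark or HDFS.

```python
# Hypothetical stand-ins (NOT PySpark classes) that mimic the shape of the
# ML persistence API: `write` is a method that returns a writer object.

class FakeWriter:
    def __init__(self, store):
        self._store = store          # dict standing in for the file system
        self._overwrite = False

    def overwrite(self):
        # Mark the writer as allowed to replace an existing path.
        self._overwrite = True
        return self

    def save(self, path):
        if path in self._store and not self._overwrite:
            raise OSError(f"Path {path} already exists.")
        self._store[path] = "model"


class FakeModel:
    def __init__(self, store):
        self._store = store

    def write(self):
        # A method, not a property: it must be called to get the writer.
        return FakeWriter(self._store)

    def save(self, path):
        # Shortcut for write().save(path), which refuses to overwrite.
        self.write().save(path)


store = {}
model = FakeModel(store)
model.save("rfr_model")              # first save succeeds

# Without parentheses, `model.write` is the method object itself,
# which has no `overwrite` attribute.
try:
    model.write.overwrite()
    attribute_error_raised = False
except AttributeError:
    attribute_error_raised = True

# Calling write() first returns the writer, so overwrite() can be chained.
model.write().overwrite().save("rfr_model")
```

If the same holds for the real model, the corresponding call would be `rf_model.write().overwrite().save(rf_model_path)`.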