Why can't I save my trained RandomForestRegressor model?

Question

I have a trained RandomForestRegressor model I would like to save to a file for re-use. I'm following the instructions on the scikit-learn persistence page, and can save the trained model. The problem is that I cannot seem to re-use it because scikit-learn does not recognize it as trained.

model = RandomForestRegressor(n_estimators=100, max_features='sqrt', max_depth=12, n_jobs=24)
model.fit(training_input, training_target_values)
joblib.dump(model, './trained_model/tree.pkl')

But when I try to re-use the model:

model = joblib.load('./trained_model/tree.pkl') 
prediction = np.array(model.predict(patient_arr))

I get the error:

  File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/forest.py", line 614, in predict
    check_is_fitted(self, 'n_outputs_')
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 627, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.utils.validation.NotFittedError: This RandomForestRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.

I've also tried:

trained_model = model.fit(training_input, training_target_values)
joblib.dump(trained_model, './trained_model/tree.pkl')

with the same results.

The point of the post is that the model was fit without error, but the saved model cannot be used for prediction without raising the NotFittedError listed above. The error says explicitly that the model was not fit, when it is clear from the code already posted that it was in fact fit. The relevant code is already posted. — user2188329

MingYong Liu · Accepted Answer · 2016-01-22T09:16:29

Maybe the file extension is wrong. I used joblib.dump(rfClf, "models/train_model_6.m") to save the model and it worked, so you could give that a try.
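To check whether the extension is really what matters on a given setup, a small round trip can help. The following is only a minimal sketch under stated assumptions: the synthetic data, the temporary directory, and the names X, y, restored and same_predictions are illustrative and not from the question, and it assumes a recent Python 3 install where joblib is imported as a standalone package.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.utils.validation import check_is_fitted

# Synthetic stand-in data (an assumption; the question's data is not shown).
rng = np.random.RandomState(0)
X = rng.rand(50, 4)
y = X.sum(axis=1)

# Same kind of model as in the question, kept small so it trains quickly.
model = RandomForestRegressor(n_estimators=10, max_features='sqrt',
                              max_depth=12, random_state=0)
model.fit(X, y)

# Save to a temporary directory and load the model back from the same path.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tree.pkl')
    joblib.dump(model, path)
    restored = joblib.load(path)

# Raises NotFittedError if the restored estimator lacks the fitted attribute
# named in the question's traceback.
check_is_fitted(restored, 'n_outputs_')

# The restored model should reproduce the original model's predictions.
same_predictions = np.allclose(model.predict(X), restored.predict(X))
print(same_predictions)
```

If this round trip succeeds with a .pkl path but the real script still fails, it may be worth confirming that the file being loaded is the one the training run actually wrote, and that the same scikit-learn and joblib versions are used for saving and loading.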
