0
votes

I have a trained model stored in a pickle file. All I need to do is load a single-row pandas DataFrame and get a prediction by passing it to the model.

To handle the categorical columns during training, I used one-hot encoding. So, to convert the single-row DataFrame into model input, I applied one-hot encoding to it as well. But it raises an error.

import pickle
import category_encoders as ce
import pandas as pd

pkl_filename = "pickle_model.pkl"

with open(pkl_filename, 'rb') as file:
    pickle_model = pickle.load(file)

ohe = ce.OneHotEncoder(handle_unknown='ignore', use_cat_names=True)
X_t = pd.read_pickle("case1.pkl")
X_t_ohe = ohe.fit_transform(X_t)
X_t_ohe = X_t_ohe.fillna(0)
Ypredict = pickle_model.predict(X_t_ohe)
print(Ypredict[0])

Traceback (most recent call last):
  File "Predict.py", line 14, in <module>
    Ypredict = pickle_model.predict(X_t_ohe)
  File "/home/neo/anaconda3/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 289, in predict
    scores = self.decision_function(X)
  File "/home/neo/anaconda3/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 270, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 93 features per sample; expecting 989

1
Can you share your training script as well? - Venkatachalam

1 Answer

0
votes

This happens because one-hot encoding expands each categorical column into one numeric column per category. The pickled model was trained on data that had been encoded into 989 columns, but an encoder fitted on a single row only sees the categories present in that row and produces just 93 columns, so the dimensions (the number of columns) do not match.
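
The mismatch can be reproduced on a toy example. This is only a sketch: it uses `pd.get_dummies` in place of `category_encoders`, and the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical training data: one categorical column with three categories.
train = pd.DataFrame({"color": ["red", "green", "blue"], "size": [1, 2, 3]})

# Hypothetical single row to predict on.
single = pd.DataFrame({"color": ["red"], "size": [1]})

# Encoding each frame independently yields different column sets,
# because each encoding only knows the categories it has seen.
train_ohe = pd.get_dummies(train)
single_ohe = pd.get_dummies(single)

print(list(train_ohe.columns))   # size plus one column per color seen in training
print(list(single_ohe.columns))  # size plus a single color column
```

A model fitted on `train_ohe` would reject `single_ohe` for the same reason as in the traceback: the column counts differ.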

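To fix this, fit the one-hot encoder on the training data, retrain the model on the encoded output, and pickle the fitted encoder together with the model. At prediction time, load both and call the encoder's transform (not fit_transform) on the new row, so that it produces the same columns the model was trained on.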
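
One possible way to keep the encoder and the model together is a scikit-learn pipeline. This is a minimal sketch under assumptions, not the asker's training script: it swaps in scikit-learn's `OneHotEncoder` for `category_encoders`, uses `LogisticRegression` as a stand-in model, and the data is invented.

```python
import pickle

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data with categorical features only.
X_train = pd.DataFrame({
    "color": ["red", "green", "blue", "red"],
    "shape": ["round", "square", "round", "square"],
})
y_train = [0, 1, 0, 1]

# Training side: the encoder is fitted once, together with the model.
# handle_unknown="ignore" encodes unseen categories as all zeros.
pipeline = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    LogisticRegression(),
)
pipeline.fit(X_train, y_train)

# Serialize encoder and model as one object
# (in practice, pickle.dump to a file instead of a bytes object).
blob = pickle.dumps(pipeline)

# Prediction side: load and predict. The stored encoder is only applied
# (transform), never refitted, so the column layout matches training.
loaded = pickle.loads(blob)
X_new = pd.DataFrame({"color": ["purple"], "shape": ["round"]})  # "purple" is unseen
prediction = loaded.predict(X_new)
print(prediction[0])
```

Because the encoder travels inside the pickled pipeline, the prediction script never calls `fit_transform` on new data, which is what caused the 93 vs. 989 mismatch.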