I am learning machine learning from the book Artificial-Intelligence-with-Python-Second-Edition, and I ran into this error:

ValueError: too many values to unpack (expected 3)

Here is the code from the book:

 print("\nGrid scores for the parameter grid:")
 for params, avg_score, _ in classifier.grid_scores_: # from sklearn import grid_search 
    print(params, '-->', round(avg_score, 3))

(The code for the tutorial was taken from the GitHub: Artificial-Intelligence-with-Python-Second-Edition/Chapter06/run_grid_search.py )

The book imports the module with: from sklearn import grid_search. That module is no longer used, so I need to replace grid_scores_ with cv_results_. But when I use the cv_results_ attribute, I get this error:

ValueError: too many values to unpack (expected 3)

I have tried different variants and re-read all the help I could find on this topic, but I have not found a solution yet.

My full code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

from utilities import visualize_classifier

# Load input data
input_file = 'data_random_forests.txt'
data = np.loadtxt(input_file, delimiter=',')
X, y = data[:, :-1], data[:, -1]

# Separate input data into three classes based on labels
class_0 = np.array(X[y==0])
class_1 = np.array(X[y==1])
class_2 = np.array(X[y==2])

# Split the data into training and testing datasets 
X_train, X_test, y_train, y_test = train_test_split(
     X, y, test_size=0.25, random_state=5)

# Define the parameter grid 
parameter_grid = [ {'n_estimators': [100], 'max_depth': [2, 4, 7, 12, 16]},
               {'max_depth': [4], 'n_estimators': [25, 50, 100, 250]}
             ]

metrics = ['precision_weighted', 'recall_weighted']

for metric in metrics:
    print("\n##### Searching optimal parameters for", metric)

    # Run one grid search per metric (GridSearchCV comes from sklearn.model_selection)
    classifier = GridSearchCV(
            ExtraTreesClassifier(random_state=0),
            parameter_grid, cv=5, scoring=metric)
    classifier.fit(X_train, y_train)

    print("\nGrid scores for the parameter grid:")
    for params, avg_score, _ in classifier.cv_results_:
        print(params, '-->', round(avg_score, 3))

    print("\nBest parameters:", classifier.best_params_)

    y_pred = classifier.predict(X_test)
    print("\nPerformance report:\n")
    print(classification_report(y_test, y_pred))
You are trying to assign each element in classifier.cv_results_ to (params, avg_score, _), but each element in cv_results_ has more than 3 components, hence the error. To use grid_search.GridSearchCV you need to look at the documentation and figure out how to get the params and scores a different way. - Elliot Way

1 Answer

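GridSearchCV.cv_results_ is a dictionary of NumPy ndarrays (source). Iterating over a dictionary yields its keys, so your loop tries to unpack each key, a string such as 'mean_fit_time', into 3 variables (params, avg_score and _), and that fails. It worked in the past because the old grid_scores_ attribute was a list of 3-element named tuples (parameters, mean validation score, per-fold scores), while cv_results_ is a single dictionary.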
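To see where the error comes from without running a grid search, here is a minimal sketch that uses a plain dictionary as a stand-in for cv_results_. The name fake_cv_results and its values are made up for illustration; only the key names mirror what scikit-learn produces.

```python
# Hypothetical stand-in for classifier.cv_results_: a dict keyed by column name.
fake_cv_results = {
    "mean_test_score": [0.81, 0.84],
    "params": [{"max_depth": 2}, {"max_depth": 4}],
}

# Iterating over a dict yields its keys, so each item is a string.
items = list(fake_cv_results)
print(items)

# Unpacking a long string into three names raises the same ValueError.
error_message = None
try:
    for params, avg_score, _ in fake_cv_results:
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The loop never reaches the scores at all: it fails on the first key, because a string with more than three characters cannot be unpacked into three names.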
It's very straightforward to convert the dictionary into a pandas DataFrame.

import pandas as pd
df = pd.DataFrame(classifier.cv_results_)

You are interested in printing only the parameters and the scores, so let's do that by selecting the columns which have 'param' or 'score' in their names:

df_columns_to_print = [column for column in df.columns if 'param' in column or 'score' in column]
print(df[df_columns_to_print])
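If you would rather keep the book's output format than print a DataFrame, one option is to pair the 'params' and 'mean_test_score' entries of cv_results_ with zip. The sketch below is only an illustration: it assumes synthetic data from make_classification in place of data_random_forests.txt and a reduced parameter grid so that it runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic three-class data (an assumption, standing in for the book's file).
X, y = make_classification(n_samples=150, n_features=4, n_informative=3,
                           n_redundant=0, n_classes=3, random_state=0)

# Reduced grid, chosen only to keep the example fast.
parameter_grid = {'n_estimators': [10, 25], 'max_depth': [2, 4]}

classifier = GridSearchCV(ExtraTreesClassifier(random_state=0),
                          parameter_grid, cv=5, scoring='precision_weighted')
classifier.fit(X, y)

# cv_results_ stores one entry per parameter combination under each key,
# so 'params' and 'mean_test_score' can be walked in parallel.
results = classifier.cv_results_
print("\nGrid scores for the parameter grid:")
for params, avg_score in zip(results['params'], results['mean_test_score']):
    print(params, '-->', round(avg_score, 3))
```

In your script, the same zip loop would replace the three-variable loop over classifier.cv_results_; the per-fold scores that used to land in _ are available under keys such as 'split0_test_score' if you need them.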