1
votes

This question has been asked several times before, but I get an error when I follow the suggested answer.

First, I mark which rows belong to the training set (-1) and which belong to the validation set (0), as follows.

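my_test_fold = []

# -1 marks rows that always stay in the training set
for i in range(len(train_x)):
    my_test_fold.append(-1)

# 0 marks rows that belong to validation fold 0
for i in range(len(test_x)):
    my_test_fold.append(0)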

Then I run the grid search.

from sklearn.model_selection import PredefinedSplit

param = {
    'n_estimators': [200],
    'max_depth': [5],
    'min_child_weight': [3],
    'reg_alpha': [6],
    'gamma': [0.6],
    'scale_neg_weight': [1],
    'learning_rate': [0.09],
}

gsearch1 = GridSearchCV(
    estimator=XGBClassifier(objective='reg:linear', seed=1),
    param_grid=param,
    scoring='roc_auc',
    cv=PredefinedSplit(test_fold=my_test_fold),
    verbose=1,
)

gsearch1.fit(new_data_df, df_y)

But I get the following error:

 object of type 'PredefinedSplit' has no len()

2 Answers

1
votes

Try replacing

cv = PredefinedSplit(test_fold=my_test_fold)

with

cv = list(PredefinedSplit(test_fold=my_test_fold).split(new_data_df, df_y))

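The reason is that your GridSearchCV appears to expect cv to be a sequence of (train indices, test indices) pairs rather than the splitter object itself, which is why it fails when it asks for the length. Calling split() generates those pairs, and list() turns the resulting generator into a list, which does have a length.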
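
For reference, here is a minimal self-contained sketch of this approach. It is not the asker's setup: LogisticRegression and the synthetic arrays are stand-ins I assume in place of XGBClassifier, new_data_df and df_y, so that it runs with scikit-learn and NumPy alone. The names X_train, X_val, cv_pairs and search are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, PredefinedSplit

# Hypothetical toy data standing in for the question's train/validation sets.
rng = np.random.RandomState(0)
y_train = np.arange(80) % 2          # alternating 0/1 labels, both classes present
y_val = np.arange(20) % 2
X_train = rng.normal(size=(80, 4))
X_val = rng.normal(size=(20, 4))
X_train[:, 0] += 2 * y_train         # make the first feature informative
X_val[:, 0] += 2 * y_val

# Stack the rows in the same order as the fold labels below.
X_all = np.vstack([X_train, X_val])
y_all = np.concatenate([y_train, y_val])

# -1: row is never used for validation; 0: row belongs to validation fold 0.
test_fold = [-1] * len(X_train) + [0] * len(X_val)
splitter = PredefinedSplit(test_fold=test_fold)

# Materialize the single (train_indices, validation_indices) pair.
cv_pairs = list(splitter.split(X_all, y_all))
train_idx, val_idx = cv_pairs[0]

search = GridSearchCV(
    estimator=LogisticRegression(),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=cv_pairs,
)
search.fit(X_all, y_all)
print(search.best_params_)
```

Note that with the default refit=True, GridSearchCV refits the best configuration on every row passed to fit, validation rows included. Pass refit=False if you only want the scores.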

-2
votes

The hypopt Python package (pip install hypopt), of which I am an author, was created for this exact purpose: parameter optimization with a validation set. It works with scikit-learn models and can be used with Tensorflow, PyTorch, Caffe2, etc.

# Code from https://github.com/cgnorthcutt/hypopt
# Assuming you already have train, test, val sets and a model.
from hypopt import GridSearch
from sklearn.svm import SVR

param_grid = [
    {'C': [1, 10, 100], 'kernel': ['linear']},
    {'C': [1, 10, 100], 'gamma': [0.001, 0.0001], 'kernel': ['rbf']},
]
# Grid-search all parameter combinations using a validation set.
opt = GridSearch(model=SVR(), param_grid=param_grid)
opt.fit(X_train, y_train, X_val, y_val)
print('Test Score for Optimized Parameters:', opt.score(X_test, y_test))

Edit: Has something changed with hypopt to cause the sudden recent downvotes? Some feedback would help as hypopt solves this exact problem and if there is an issue, we should fix it.