
I am trying to tune the hyperparameters of a Support Vector Machine classifier so that it accurately predicts classes that overlap heavily. The objective is to find a precise value of C, something like 7.568787, that best separates the classes.

The part of the code that deals with this is as follows:

from sklearn.svm import SVC
from scipy.stats import loguniform
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.calibration import CalibratedClassifierCV

parameters = {"C": loguniform(1e-6, 1e+6)}

grid = GridSearchCV(
    estimator=CalibratedClassifierCV(
        SVC(kernel='rbf', gamma='scale', decision_function_shape='ovr', class_weight=None),
        method='sigmoid',
        cv=5,
    ),
    param_grid=parameters,
    refit=True,
    verbose=3,
)

grid.fit(X_train, Y_train)

However, when I try to run the code, I get the following error:

ValueError: Parameter grid for parameter (C) needs to be a list or numpy array, but got (<class 'scipy.stats._distn_infrastructure.rv_frozen'>). Single values need to be wrapped in a list with one element.

1 Answer


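loguniform does not return a list of values. It returns a frozen distribution object, which GridSearchCV cannot use as a grid: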

<scipy.stats._distn_infrastructure.rv_frozen at 0x1c890037a60>

You could use it like this:

loguniform(1e-6, 1e+6).rvs(size=10)  # size: the number of samples you want

.rvs(size) draws that many random samples from the distribution and returns them as a NumPy array, which is a form GridSearchCV accepts for a parameter.
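To make the suggestion above concrete, here is a minimal sketch of feeding the sampled values to GridSearchCV. It rests on a few assumptions that are not in the question: synthetic data from make_classification stands in for X_train and Y_train, the SVC is tuned directly rather than through CalibratedClassifierCV, and the range for C is narrowed to 1e-3..1e3 to keep the run short.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data (assumption): replace with your own X_train, Y_train.
X_train, Y_train = make_classification(
    n_samples=200, n_features=10, random_state=0
)

# Draw 10 candidate values for C from a log-uniform distribution.
# random_state makes the draw reproducible; tolist() gives a plain list.
c_values = loguniform(1e-3, 1e3).rvs(size=10, random_state=0).tolist()

# GridSearchCV expects a list (or array) of candidates per parameter.
param_grid = {"C": c_values}

grid = GridSearchCV(SVC(kernel="rbf", gamma="scale"), param_grid=param_grid, cv=3)
grid.fit(X_train, Y_train)

print(grid.best_params_)
```

Note that the search can only return one of the sampled candidates, so the precision of the resulting C depends on how many samples are drawn.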

EDIT: Example with a pipeline, where each parameter name is prefixed with the name of the step it belongs to:

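from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

pipeline = make_pipeline(
    CountVectorizer(stop_words="english"),
    TfidfTransformer(),
    DecisionTreeClassifier(),
)

# make_pipeline names each step after its lowercased class name,
# so the tree's max_depth is addressed as decisiontreeclassifier__max_depth
params = {
    'decisiontreeclassifier__max_depth': [1, 5, 10, 50, 250, 500, 1000, 5000],
}

# iid was removed in scikit-learn 0.24, so it is not passed here
gridSearch_decisiontree = GridSearchCV(pipeline, params, cv=3, n_jobs=-1)
# X: raw text documents, Y: their labels (both must be defined beforehand)
gridSearch_decisiontree.fit(X, Y)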
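The same double-underscore naming likely matters for the question's setup too, because the SVC there is wrapped in CalibratedClassifierCV and a bare "C" would not reach it. The sketch below is one possible way to handle both issues, not the original answer's method: it swaps in RandomizedSearchCV, which accepts a scipy distribution directly, and looks up the nested parameter name with get_params() since that name differs between scikit-learn versions. The synthetic data, the narrowed range for C, and n_iter=5 are assumptions chosen to keep the example quick.

```python
from scipy.stats import loguniform
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Stand-in data (assumption): replace with your own X_train, Y_train.
X_train, Y_train = make_classification(
    n_samples=200, n_features=10, random_state=0
)

calibrated = CalibratedClassifierCV(
    SVC(kernel="rbf", gamma="scale"), method="sigmoid", cv=3
)

# Discover the nested name of the SVC's C parameter instead of hard-coding it;
# the prefix depends on the installed scikit-learn version.
c_key = next(name for name in calibrated.get_params() if name.endswith("__C"))

# RandomizedSearchCV samples n_iter values from the distribution itself.
search = RandomizedSearchCV(
    calibrated,
    param_distributions={c_key: loguniform(1e-3, 1e3)},
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X_train, Y_train)

print(c_key, search.best_params_[c_key])
```

Raising n_iter explores more candidate values of C at the cost of more model fits.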