I'm using grid search to fit machine learning model parameters.
I typed in the following code (modified from the sklearn documentation page: http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html):
from sklearn import svm, grid_search, datasets, cross_validation
# getting data
iris = datasets.load_iris()
# grid of parameters
parameters = {'kernel':('linear', 'poly'), 'C':[1, 10]}
# predictive model (support vector machine)
svr = svm.SVC()
# cross validation procedure
mycv = cross_validation.StratifiedKFold(iris.target, n_folds = 2)
# grid search engine
clf = grid_search.GridSearchCV(svr, parameters, cv=mycv)
# fitting engine
clf.fit(iris.data, iris.target)
However, when I look at clf.estimator, I get the following:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False)
How did I end up with a 'rbf' kernel? I didn't specify it as an option in my parameters.
What's going on?
Thanks!
P.S. I'm using version '0.15-git' of sklearn.
Addendum: I noticed that clf.best_estimator_ gives the right output. So what is clf.estimator doing?
The kernel key should have a list as its values, i.e. ['linear', 'poly'] (square brackets). rbf just showed up because it is the default. - gobrewers14
estimator is an object of the GridSearchCV class. If you create an instance of this class, i.e. clf, .estimator will return the object and in this case, since your initial code was erroneous, it returned the default. - gobrewers14
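
For anyone comparing the two attributes, here is a minimal sketch of the difference. It assumes a newer scikit-learn release in which GridSearchCV and StratifiedKFold live in sklearn.model_selection (the grid_search and cross_validation modules used in the question were removed in later versions); the rest follows the question's setup. As far as I can tell, clf.estimator is simply the unfitted template passed to the constructor, so it keeps the SVC defaults, while clf.best_estimator_ is the copy refitted with the winning parameters.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Same data and parameter grid as in the question.
iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'poly'), 'C': [1, 10]}

# Template estimator: created with defaults, so its kernel is 'rbf'.
svr = svm.SVC()

# Assumption: newer API, where the fold count is n_splits and the
# labels are supplied at fit time rather than to the constructor.
mycv = StratifiedKFold(n_splits=2)

# Pass the splitter by keyword so it cannot land in another parameter.
clf = GridSearchCV(svr, parameters, cv=mycv)
clf.fit(iris.data, iris.target)

# The template that was passed in; the search fits clones of it.
print(clf.estimator is svr)
print(clf.estimator.kernel)

# The winning combination and the estimator refitted with it.
print(clf.best_params_)
print(clf.best_estimator_.kernel)
```

In this sketch the tuple for kernel is left exactly as in the question, and the search still only tries 'linear' and 'poly'. The 'rbf' reported by clf.estimator appears to come from the SVC() default rather than from the grid.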