
I'm splitting my data set into training, validation and test data, and then performing a grid search with cross-validation (GridSearchCV) on the training data. Is that enough as a cross-validation method, or do I need to implement k-fold cross-validation after my GridSearchCV? I'm a bit confused, as I thought the grid search only helps me find the optimal hyperparameters.

1
Well, if you are splitting your data into train, dev and test, why are you using CV? I think you should use your dev set to tune the hyperparameters of your estimator. CV is meant for cases where you don't have enough data to split into train and test, or into train, dev and test. - OSainz

1 Answer


Grid search is an exhaustive search over the hyperparameters of a model, and sklearn's GridSearchCV has cross-validation built in: its cv parameter controls how the data passed to fit is split into folds (5-fold by default when cv is left as None). So running GridSearchCV on your training data already performs k-fold cross-validation for every hyperparameter combination. It takes longer than a single fit because each combination is fitted once per fold. Please refer to the official documentation for more info on this: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

You can also pass a KFold instance to your grid search, like this:

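from sklearn.model_selection import GridSearchCV, KFold

# 10 shuffled folds; "..." stands for your own estimator and parameter grid
validation = KFold(n_splits=10, shuffle=True)
clf = GridSearchCV(..., cv=validation)
clf.fit(X, y)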
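
To make the snippet above concrete, here is a minimal runnable sketch. The iris data set, the SVC estimator, the parameter grid and the 80/20 split are all assumptions chosen for illustration, not part of the question. It holds out a test set, runs the cross-validated grid search on the training part only, and then scores the refitted best model on the held-out data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.svm import SVC

# Illustrative data set (assumption); substitute your own X and y.
X, y = load_iris(return_X_y=True)

# Hold out a test set that the grid search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Illustrative estimator and grid (assumptions): 3 x 2 = 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Each candidate is fitted once per fold on the training data only.
search = GridSearchCV(SVC(), param_grid, cv=cv)
search.fit(X_train, y_train)

print(search.best_params_)            # best hyperparameter combination
print(search.best_score_)             # mean CV accuracy of that combination
print(search.score(X_test, y_test))   # accuracy of the refitted model on the test set
```

With this setup a separate validation set is not needed for tuning, because the folds inside GridSearchCV play that role; the test set is only used once at the end.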

You can also combine CV and grid search, i.e. run a full grid search inside every iteration of an outer cross-validation loop (often called nested cross-validation), though that is very computationally intensive.
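
A minimal sketch of that combination, assuming the iris data set, an SVC estimator and a small grid purely for illustration: the GridSearchCV object is itself passed to cross_val_score, so each outer fold runs its own inner grid search.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Illustrative data set (assumption); substitute your own X and y.
X, y = load_iris(return_X_y=True)

# Inner folds pick hyperparameters, outer folds estimate performance.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

# Illustrative estimator and grid (assumptions): 3 candidates.
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=inner_cv)

# For every outer split, the grid search is rerun on that split's training part
# and the refitted best model is scored on the outer test part.
scores = cross_val_score(search, X, y, cv=outer_cv)

print(scores.mean(), scores.std())
```

The cost adds up quickly: here each of the 5 outer folds fits 3 candidates on 3 inner folds plus one refit, i.e. 5 x (3 x 3 + 1) = 50 model fits for a three-value grid.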