I am reading the scikit-learn documentation on k-fold cross-validation (http://scikit-learn.org/stable/modules/cross_validation.html) and trying to understand the training procedure for each fold.
Is the following correct? When cross_val_score is computed, each fold gets its own training set and test set, and the classifier clf passed in (see the code below) is trained on that training set and evaluated on that test set to measure the performance for that fold.
Does this imply that the number of folds can affect the reported accuracy, since changing the number of folds changes how much of the data is available for training in each fold?
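To make the question about fold count concrete, here is a small sketch that only counts how many samples land in the training and test parts of the first split for a few values of n_splits. The placeholder array X (150 rows, the same size as iris) and the choice of plain KFold are my own assumptions for illustration, not something taken from the docs example.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data: 150 samples, 4 features (same shape as iris; values do not matter here)
X = np.zeros((150, 4))

sizes = {}
for k in (2, 5, 10):
    # Take the first (train, test) index pair produced for k folds
    train_idx, test_idx = next(KFold(n_splits=k).split(X))
    sizes[k] = (len(train_idx), len(test_idx))
    print(k, sizes[k])

# 2 (75, 75)
# 5 (120, 30)
# 10 (135, 15)
```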
From the docs, the cross_val_score result is generated using:
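from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)
# 5-fold cross-validation: returns one score per fold
scores = cross_val_score(clf, iris.data, iris.target, cv=5)
scores
array([ 0.96..., 1. ..., 0.96..., 0.96..., 1. ])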
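Here is how I currently picture what happens per fold, written out as a manual loop. This is a sketch of my understanding, not the library's actual implementation: I am assuming that an integer cv with a classifier means stratified folds (StratifiedKFold) and that an unfitted copy of clf is trained for each fold (sklearn.base.clone). The names manual_scores and library_scores are mine.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.base import clone
from sklearn.model_selection import StratifiedKFold, cross_val_score

iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC(kernel='linear', C=1)

manual_scores = []
# Assumption: cv=5 with a classifier corresponds to 5 stratified, unshuffled folds
for train_idx, test_idx in StratifiedKFold(n_splits=5).split(X, y):
    model = clone(clf)                      # unfitted copy, so folds share no state
    model.fit(X[train_idx], y[train_idx])   # train on the other 4 folds
    manual_scores.append(model.score(X[test_idx], y[test_idx]))  # accuracy on the held-out fold

manual_scores = np.array(manual_scores)
library_scores = cross_val_score(clf, X, y, cv=5)

print(manual_scores)
print(library_scores)
```

If the two printed arrays agree, that would suggest the picture above is right: one model trained and scored per fold, each on its own train/test split.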