You can create a custom CV iterator, for instance by taking inspiration from LeaveOneGroupOut or LeavePGroupsOut to implement the structure you are interested in.
Alternatively, you can prepare your own precomputed folds, each encoded as a pair of integer arrays (train and test sample indices between 0 and n_samples - 1), and then pass the list of pairs as the cv argument of the cross_val_score and GridSearchCV utilities:
>>> import numpy as np
>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.model_selection import cross_val_score
>>> X, y = make_classification(n_samples=10)
>>> cv_splits = [
...     (np.array([0, 1, 2, 3]), np.array([4, 5, 6])),
...     (np.array([1, 2, 3, 4]), np.array([5, 6, 7])),
...     (np.array([5, 6, 8, 9]), np.array([1, 2, 3, 4])),
... ]
>>> cross_val_score(LogisticRegression(), X, y, cv=cv_splits)
array([1.        , 0.33333333, 0.75      ])
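The first option, a custom CV iterator, can be sketched as a small class that exposes split and get_n_splits, which is the interface cross_val_score and GridSearchCV appear to rely on for splitter objects. The class name ContiguousBlockSplit and its n_blocks parameter are hypothetical and chosen only for illustration; the splitting rule (each contiguous block of samples serves as the test set once) is an assumption, not a scikit-learn built-in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


class ContiguousBlockSplit:
    """Hypothetical splitter: each contiguous block of samples is the test set once."""

    def __init__(self, n_blocks=3):
        self.n_blocks = n_blocks

    def get_n_splits(self, X=None, y=None, groups=None):
        # Number of (train, test) pairs that split() will yield.
        return self.n_blocks

    def split(self, X, y=None, groups=None):
        indices = np.arange(len(X))
        # Cut the index range into n_blocks contiguous chunks.
        for test_idx in np.array_split(indices, self.n_blocks):
            # Everything outside the current block is used for training.
            train_idx = np.setdiff1d(indices, test_idx)
            yield train_idx, test_idx


# random_state is fixed only so the sketch is repeatable.
X, y = make_classification(n_samples=30, random_state=0)
cv = ContiguousBlockSplit(n_blocks=3)
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
```

Here scores holds one accuracy value per block. Compared with a precomputed list of index pairs, a splitter object adapts to the length of X, so the same instance can be reused across datasets of different sizes.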