0
votes

I am using KNeighborsRegressor from sklearn to perform some learning. I split a dataset of 30,000 observations into training (70%) and testing (30%) sets. However, I cannot understand why two methods of evaluating the same model yield such different results.

More specifically, when I compute the r^2 score on the whole testing set at once, I get a much higher value (~0.70) than when I run k-fold cross validation on that same testing set. Why are these scores so different when the exact same model is being evaluated on exactly the same data? I am sure I am doing something wrong, but I have no clue what. Please help!

from sklearn import neighbors, model_selection
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import cross_val_score

r2_scorer = make_scorer(r2_score)

clf = neighbors.KNeighborsRegressor()
clf = clf.fit(X_train, y_train)
score1 = r2_score(y_test, clf.predict(X_test))

score1
> 0.68777300248206585

kfold = model_selection.KFold(n_splits=10, random_state=42)
scores2 = cross_val_score(clf, X_test, y_test, cv=kfold, scoring=r2_scorer)

scores2
>array([ 0.05111285,  0.65697228,  0.57468009,  0.6706573 ,  0.46720042,
        0.3763054 ,  0.56881947,  0.32569462, -0.16607888, -0.6862521 ])

scores2.mean()
> 0.28391114469744039

scores2.std()
> 0.4118551721575503
Cross validation doesn't use the same model; it fits a new one for each fold's validation. - jacoblaw
Ah, I finally see my error. Thank you very much - ata
@ata I posted an answer explaining your results and answering your main question - seralouk

1 Answer

0
votes

When you use the cross validation function:

scores2 = cross_val_score(clf, X_test, y_test, cv=kfold, scoring=r2_scorer)

you generate 10 folds, and for each fold you get an r2 score computed on that fold by a model refitted on the remaining nine folds.
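
The key point is that the clf you already fitted on X_train is not reused: cross_val_score clones the estimator and refits the clone on every fold. A minimal sketch of that behaviour, assuming X_test and y_test are NumPy arrays (manual_scores is just an illustrative name):

from sklearn.base import clone
from sklearn.metrics import r2_score

manual_scores = []
for train_idx, val_idx in kfold.split(X_test):
    fold_model = clone(clf)  # fresh, unfitted copy of the estimator
    fold_model.fit(X_test[train_idx], y_test[train_idx])
    preds = fold_model.predict(X_test[val_idx])
    manual_scores.append(r2_score(y_test[val_idx], preds))

manual_scores ends up holding one r2 value per fold, just like scores2 above.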

So the results:

scores2
>array([ 0.05111285,  0.65697228,  0.57468009,  0.6706573 ,  0.46720042,
         0.3763054 ,  0.56881947,  0.32569462, -0.16607888, -0.6862521 ])

as you can see, include 10 values; each value corresponds to one fold.

Bottom line:

It is normal to get a different r2 score for each fold: each fold is scored on a different slice of the data, and the model is refitted from scratch every time instead of reusing the clf trained on X_train. With roughly 9,000 test observations (30% of 30,000), each fold fits on about 8,100 rows and is scored on only about 900, so the per-fold scores are much noisier than the single r2 computed on the whole test set at once.
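
If you want a cross-validated estimate of how well the model generalizes, the more conventional approach is to cross-validate on the full data rather than on the 30% test set alone. A minimal sketch, assuming the full feature matrix and targets from before the split are available as X and y (those names are assumptions):

from sklearn import neighbors, model_selection
from sklearn.model_selection import cross_val_score

# shuffle=True is what makes random_state meaningful for KFold
kfold_full = model_selection.KFold(n_splits=10, shuffle=True, random_state=42)
model = neighbors.KNeighborsRegressor()
cv_scores = cross_val_score(model, X, y, cv=kfold_full, scoring="r2")

print(cv_scores.mean(), cv_scores.std())

Each fold then trains on about 27,000 rows instead of about 8,100, so the individual fold scores are far more stable.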