After building the classification model, I evaluated it by means of accuracy, precision, and recall. To check for overfitting I used K-fold cross-validation. I am aware that if my model's scores differ greatly from my cross-validation scores, then my model is overfitting. However, I am stuck on how to define the threshold: how much difference in the scores actually indicates that the model is overfitting? For example, here are the 3 splits (3-fold CV, shuffle=True, random_state=42) and their respective scores for a Logistic Regression model:
Split Number 1
Accuracy= 0.9454545454545454
Precision= 0.94375
Recall= 1.0
Split Number 2
Accuracy= 0.9757575757575757
Precision= 0.9753086419753086
Recall= 1.0
Split Number 3
Accuracy= 0.9695121951219512
Precision= 0.9691358024691358
Recall= 1.0
Scores from training the Logistic Regression model directly, without CV:
Accuracy= 0.9530201342281879
Precision= 0.952054794520548
Recall= 1.0
So by what magnitude do my scores need to vary before I can conclude that the model is overfitting?
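For reference, here is roughly how the numbers above can be reproduced. This is a minimal, self-contained sketch assuming scikit-learn; the make_classification stand-in data, max_iter=1000, and the single train/test split used for the "without CV" run are illustrative assumptions of mine, not part of the original setup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import KFold, train_test_split

# Stand-in data; replace with the actual feature matrix X and labels y.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# 3-fold CV with shuffle=True, random_state=42, scoring each split separately.
kf = KFold(n_splits=3, shuffle=True, random_state=42)
for i, (train_idx, test_idx) in enumerate(kf.split(X), start=1):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    print(f"Split Number {i}")
    print("Accuracy=", accuracy_score(y[test_idx], pred))
    print("Precision=", precision_score(y[test_idx], pred))
    print("Recall=", recall_score(y[test_idx], pred))

# The "without CV" numbers: here I assume a single held-out train/test split;
# adjust this part to however the no-CV score was actually computed.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("Accuracy=", accuracy_score(y_te, pred))
print("Precision=", precision_score(y_te, pred))
print("Recall=", recall_score(y_te, pred))
```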