I am doing a classification task using libsvm. With 10-fold cross-validation I get an F1 score of 0.80. However, when I instead split the training dataset into two parts (one for training and one for testing, which I call the holdout test set) the F1 score drops to 0.65. The split is an 80/20 ratio.
So, my question is: is there a significant difference between k-fold cross-validation and a holdout test? Which of the two techniques will produce a model that generalizes well? In both cases, my dataset is scaled.
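For concreteness, here is a minimal sketch of the two evaluation schemes side by side, using scikit-learn's `SVC` (which wraps libsvm) on synthetic data; the dataset, kernel, and random seeds are illustrative assumptions, not the setup described above. Note that scaling is done inside a pipeline so each fold is scaled using only its own training data, avoiding leakage:

```python
# Compare 10-fold cross-validated F1 against a single 80/20 holdout F1.
# SVC here is scikit-learn's libsvm wrapper; data and parameters are
# placeholders, not the original poster's setup.
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data (assumption: 500 samples, 20 features)
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Scaling inside the pipeline keeps each CV fold's scaler fit on training data only
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# 10-fold CV: F1 averaged over 10 different held-out folds (lower variance)
cv_f1 = cross_val_score(model, X, y, cv=10, scoring="f1").mean()

# Holdout: one 80/20 split, a single F1 estimate (higher variance)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
holdout_f1 = f1_score(y_te, model.fit(X_tr, y_tr).predict(X_te))

print(f"10-fold CV F1: {cv_f1:.3f}, holdout F1: {holdout_f1:.3f}")
```

Because the holdout estimate comes from a single split, it is much noisier than the CV average, which is one common reason the two numbers disagree.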