1 vote

I am doing a classification task using libsvm. With 10-fold cross-validation the F1 score is 0.80. However, when I split the training dataset into two parts (one for training and the other for testing, which I call the holdout test set), the F1 score drops to 0.65. The split is in an 80/20 ratio.

So, my question is: is there any significant difference between doing k-fold cross-validation and a holdout test? Which of the two techniques will produce a model that generalizes well? In both cases my dataset is scaled.
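
To make the setup concrete, here is a minimal sketch of the two evaluations I described, using scikit-learn's SVC (which wraps libsvm); the synthetic data and SVM parameters are placeholders for my actual scaled dataset:

    # Sketch: 10-fold CV F1 vs. a single 80/20 holdout F1.
    # X and y below are synthetic stand-ins for the real dataset.
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.metrics import f1_score
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    # Scaling lives inside the pipeline so it is re-fit on each training fold.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

    # 10-fold cross-validated F1.
    cv_f1 = cross_val_score(model, X, y, cv=10, scoring="f1")
    print("10-fold CV F1: %.3f +/- %.3f" % (cv_f1.mean(), cv_f1.std()))

    # Single 80/20 holdout split.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model.fit(X_tr, y_tr)
    print("Holdout F1:    %.3f" % f1_score(y_te, model.predict(X_te)))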


2 Answers

5 votes

There are huge differences, but an exact analysis requires a fair amount of statistics. For a deeper understanding, refer to The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Hastie, Tibshirani and Friedman.

In short:

  • A single train-test split is an unreliable measure of model quality (unless you have a very large dataset).
  • Repeated train-test splits converge to the true score, provided the training set is representative of the underlying distribution; in practice, however, they are often overoptimistic.
  • CV tends to give lower (less optimistic) scores of model quality than train-test splits and converges to a reasonable answer much faster, though at the cost of higher computational complexity.
  • If you have a large dataset (>50,000 samples), a train-test split might be enough.
  • If you have enough time, CV is nearly always a better (less optimistic) way to measure classifier quality.
  • There are more methods than just these two; you might also want to look at methods from the err0.632 (bootstrap) family; a minimal sketch is given after this list.
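
As a rough sketch of the err0.632 idea from the last bullet (the synthetic data, SVC parameters and number of bootstrap rounds are placeholder assumptions, not part of the question): train on bootstrap samples, score on the out-of-bag rows, and blend that with the resubstitution score using Efron's 0.368/0.632 weights.

    # Sketch of a .632 bootstrap estimate of the F1 score.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.metrics import f1_score
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    model = make_pipeline(StandardScaler(), SVC())
    rng = np.random.RandomState(0)

    n, B = len(y), 50
    oob_scores = []
    for _ in range(B):
        # Draw a bootstrap sample (with replacement), score on out-of-bag rows.
        boot = rng.randint(0, n, n)
        oob = np.setdiff1d(np.arange(n), boot)
        model.fit(X[boot], y[boot])
        oob_scores.append(f1_score(y[oob], model.predict(X[oob])))

    # Resubstitution (training) score on the full data.
    model.fit(X, y)
    resub = f1_score(y, model.predict(X))

    f1_632 = 0.368 * resub + 0.632 * np.mean(oob_scores)
    print(".632 bootstrap F1 estimate: %.3f" % f1_632)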
0 votes

The difference comes from using a single split: if you split the data into train/test a different way (perhaps after shuffling), you will get a different value. Therefore, creating several such splits and averaging all the F1 scores gives a result comparable to CV, and CV generalizes better. A minimal sketch of this is shown below.
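
Here is that idea as a sketch, assuming scikit-learn's libsvm-backed SVC and synthetic placeholder data; a single split, the average over 20 reshuffled splits, and 10-fold CV are computed side by side:

    # Sketch: one split vs. averaged reshuffled splits vs. 10-fold CV.
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.model_selection import ShuffleSplit, cross_val_score
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    model = make_pipeline(StandardScaler(), SVC())

    single = cross_val_score(model, X, y, scoring="f1",
                             cv=ShuffleSplit(n_splits=1, test_size=0.2, random_state=0))
    repeated = cross_val_score(model, X, y, scoring="f1",
                               cv=ShuffleSplit(n_splits=20, test_size=0.2, random_state=0))
    cv10 = cross_val_score(model, X, y, cv=10, scoring="f1")

    print("single 80/20 split F1: %.3f" % single[0])
    print("20 shuffled splits F1: %.3f +/- %.3f" % (repeated.mean(), repeated.std()))
    print("10-fold CV F1:         %.3f +/- %.3f" % (cv10.mean(), cv10.std()))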