Suppose I split my data into training set and validation set. I perform a 5-fold cross-validation on my training set to obtain the optimal hyper-parameters for my model, then I use the optimal hyper-parameters to train my model and apply the resulting model on my validation set. My question is, is it reasonable to combine the training and validation set, and use the hyper-parameters obtained from the training set to build a final model?
1 Answers
It is resonable if training data was relatively small and adding validation set makes your model significantly stronger. However, at the same time, adding new data makes your previously selected hyperparameters possibly suboptimal (it is really hard to show what kind of transformation of hyperparameters you should apply when you add new data to your training set). Thus you balance two things - gain in model quality from more data and possible loss due to hard to predict change in hyperparameters meaning. To some extent you can simulate this process to make sure it makes sense, if you have N points in training data and M in validation, you can try to split training further to chunks with the same proportion (thus one is now N * (N/(N+M) and other N * (M/(N+M))), train on first one and check whether optimal hyperparameters transfer (more or less) to the optimal one on the whole training set - if so, you can safely add validation as they should transfer as well. If they do not - the risk might be not worth the gain.