I split my dataset into training and testing sets. After finding the best hyperparameters on the training set, should I refit the model on all the data? The goal is to reach the highest possible score on new data.
2 Answers
I don't think so. If you do that, you will no longer have a valid test set. What happens when you come back to improve the model later? You would need a new test set for every model revision, which means more labeling, and you would not be able to compare experiments across model versions because the test set would no longer be identical.
If you consider this model finished forever, then it's fine.
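To make the point concrete, here is a minimal sketch using scikit-learn (the dataset, model, and parameter grid are illustrative assumptions, not from the question): hyperparameter search runs only inside the training split, and the test split is touched exactly once, so the same score remains comparable across future model versions.

```python
# Sketch: tune hyperparameters on the training split only,
# then report the final score on the untouched test split.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Cross-validation happens entirely inside the training set;
# the test set is never seen during tuning.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5)
search.fit(X_train, y_train)

# Because the test set was held out, this score stays comparable
# across future experiments as long as the split is unchanged.
test_score = search.score(X_test, y_test)
print(search.best_params_, round(test_score, 3))
```

Note that `GridSearchCV` already refits the best estimator on the full training split by default (`refit=True`), so the test score above reflects a model trained on all the training data it is allowed to see.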