How do I handle unknown words when testing a text classifier? When training a model, we can use a smoothing technique (Laplace add-one) to make sure every word receives a count of at least 1 in each class.
But what about the testing stage? If a word in a test document never occurs in the training data, what is the best way to deal with it? Should I simply skip it, or should I also give it a count of 1?
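
The two options can be made concrete with a minimal sketch, assuming a multinomial Naive Bayes model with add-one smoothing. The toy data, the function names `train` and `log_scores`, and the `unknown` parameter are all made up for illustration; the "add_one" option here just applies the usual smoothed formula with a count of 0 for the unseen word, which is one possible reading of "give it a count of 1".

```python
import math
from collections import Counter

def train(labeled_docs):
    """Collect per-class word counts from (tokens, label) pairs."""
    word_counts = {}        # label -> Counter of word occurrences
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()           # every word seen in training
    for tokens, label in labeled_docs:
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def log_scores(tokens, model, unknown="skip"):
    """Return the unnormalized log posterior of each class.

    unknown="skip":    ignore test words that never occurred in training.
    unknown="add_one": treat them as count 0 and apply the add-one formula.
    """
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())  # number of word tokens in this class
        score = math.log(doc_counts[label] / n_docs)  # log prior
        for w in tokens:
            if w not in vocab and unknown == "skip":
                continue
            # Add-one smoothing: (count + 1) / (class total + vocabulary size)
            score += math.log((counts[w] + 1) / (total + len(vocab)))
        scores[label] = score
    return scores

# Hypothetical toy data: the two classes have different total word counts.
train_docs = [
    ("good great fun".split(), "pos"),
    ("great acting good plot really fun".split(), "pos"),
    ("bad boring".split(), "neg"),
]
model = train(train_docs)

test_tokens = "good fun zzz".split()  # "zzz" never occurs in training
skip = log_scores(test_tokens, model, unknown="skip")
add_one = log_scores(test_tokens, model, unknown="add_one")
```

In this sketch, skipping the unknown word leaves every class score unchanged, while the add-one option lowers each class score by log(N_c + V), where N_c is the number of word tokens in class c and V is the vocabulary size. Because N_c differs between classes, the second option is not neutral: it penalizes the class with more training tokens more heavily.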
Thanks for any suggestions or opinions. Specifically, I am using a Naive Bayes classifier.