I keep reading that Naive Bayes needs fewer features than many other ML algorithms. But what is the minimum number of features you actually need to get good results (say, 90% accuracy) with a Naive Bayes model? I know there is no objective answer -- it depends on your exact features and on what you are trying to learn -- but I'm looking for a numerical ballpark.
I'm asking because I have a dataset with around 280 features and want to understand whether this is far too few to use with Naive Bayes. (I tried running Naive Bayes on my dataset and got 86% accuracy, but I cannot trust this number: my data is imbalanced, and I believe the imbalance may be responsible for the high accuracy. I am currently trying to fix this problem.)
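To show why I don't trust the accuracy figure, here is a minimal sketch in plain Python. The 86/14 class split is invented for illustration and is not my real class ratio, and the helper names `accuracy` and `balanced_accuracy` are my own. It compares plain accuracy with the mean per-class recall for a "model" that always predicts the majority class.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Invented labels: 0 = "not the infobox date", 1 = "the infobox date".
y_true = [0] * 86 + [1] * 14

# A baseline that ignores the features and always predicts the majority class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

baseline_acc = accuracy(y_true, y_pred)
baseline_bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={baseline_acc:.2f}, balanced accuracy={baseline_bal:.2f}")
```

With a split like this, the do-nothing baseline reaches high plain accuracy while its balanced accuracy stays at chance, which is the effect I suspect in my own result.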
In case it's relevant: the exact problem I'm working on is generating time tags for Wikipedia articles. The infobox of a Wikipedia article often contains a date. In many articles, however, the date appears in the text but is missing from the infobox. I want to use Naive Bayes to decide which of the dates found in the article's text should be placed in the infobox. Every time I find a sentence with a date in it, I turn it into a feature vector -- recording which paragraph the sentence appeared in, how many times that particular date appears in the article, etc. I've limited myself to a small subset of Wikipedia articles -- just apple articles -- and as a result I only have 280 or so features. Any idea if this is enough data?
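A rough sketch of the kind of feature vector I mean, covering only the two features named above. Everything here is a simplified stand-in: the function name `date_feature_rows` is hypothetical, the four-digit-year regex replaces real date extraction, and the paragraphs are placeholder text rather than a real article.

```python
import re
from collections import Counter

# Assumption: a "date" is just a four-digit number; real extraction is more involved.
YEAR = re.compile(r"\b\d{4}\b")


def date_feature_rows(paragraphs):
    """Return one (date, [paragraph_number, article_count]) row per date mention."""
    mentions = [
        (number, date)
        for number, text in enumerate(paragraphs, start=1)
        for date in YEAR.findall(text)
    ]
    # How many times each date appears in the whole article.
    counts = Counter(date for _, date in mentions)
    return [(date, [number, counts[date]]) for number, date in mentions]


# Placeholder text, not taken from any real article.
paragraphs = [
    "Placeholder paragraph mentioning 1976 once.",
    "Another placeholder paragraph with 1976 and 1984.",
]

rows = date_feature_rows(paragraphs)
for date, vector in rows:
    print(date, vector)
```

Each row would then be labeled by whether that date is the one that belongs in the infobox.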
Thanks!