Say we have used the TFIDF transform to encode documents into continuous-valued features.
How would we now use this as input to a Naive Bayes classifier?
Bernoulli naive-bayes is out, because our features aren't binary anymore.
Seems like we can't use Multinomial naive-bayes either, because the values are continuous rather than categorical.
As an alternative, would it be appropriate to use gaussian naive bayes instead? Are TFIDF vectors likely to hold up well under the gaussian-distribution assumption?
The sci-kit learn documentation for MultionomialNB suggests the following:
The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial distribution normally requires integer feature counts. However, in practice, fractional counts such as tf-idf may also work.
Isn't it fundamentally impossible to use fractional values for MultinomialNB?
As I understand it, the likelihood function itself assumes that we are dealing with discrete-counts (since it deals with counting/factorials)
How would TFIDF values even work with this formula?