
I know that Naive Bayes is good at binary classification, but I wanted to know how multiclass classification works.

For example: I did a text classification using Naive Bayes earlier, in which I performed vectorization of the text to find the probability of each word in the document, and later used the vectorized data to fit a Naive Bayes classifier.
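Roughly, that earlier setup looked like this (toy corpus and labels, just for illustration):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    corpus = ["free prize now", "meeting at noon", "win money fast", "lunch tomorrow"]
    labels = [1, 0, 1, 0]                        # e.g. 1 = spam, 0 = ham

    vec = CountVectorizer()                      # word-count vectorization
    X = vec.fit_transform(corpus)
    clf = MultinomialNB().fit(X, labels)         # learns per-word probabilities per class
    print(clf.predict(vec.transform(["free money prize"])))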

Now, I am working with the data which looks like:

A,   B,   C, D,  E,   F,       G
210, 203, 0, 30, 710, 2587452, 0
273, 250, 0, 30, 725, 3548798, 1
283, 298, 0, 31, 785, 3987452, 3

In the above data, there are six features (A-F) and G is the class, taking the value 0, 1, or 2.

I have almost 70,000 entries in the dataset, with the class (output) being 0, 1, or 2.

After splitting the data into training and test sets, I fit the training data with sklearn's GaussianNB. After fitting, when I try to predict the test data, it only ever classifies samples as 0 or 2.
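A minimal sketch of what I am doing (the file name and column names are placeholders):

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    df = pd.read_csv("data.csv")                 # columns A-F are features, G is the class
    X = df[["A", "B", "C", "D", "E", "F"]]
    y = df["G"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    clf = GaussianNB().fit(X_train, y_train)
    print(set(clf.predict(X_test)))              # comes out as {0, 2}, never 1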

So, my question is: since I performed vectorization before fitting the Naive Bayes classifier during text classification, is there any pre-processing I need to do on the above data before fitting the GaussianNB classifier with the training data, so that it can predict all three classes (0, 1, and 2) instead of only 0 and 2?

If you get predictions for only two of the classes, try normalizing the features before fitting the model. It seems that some features are in a much higher numeric range, and this affects the training (dominance of some features over others). – seralouk

1 Answer


I know that Naive Bayes is good at binary classification, but I wanted to know how multiclass classification works.

There is nothing in Naive Bayes specific to binary classification; it is designed to do multiclass classification just fine.
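A quick check with sklearn's built-in three-class iris dataset:

    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)            # y contains classes 0, 1 and 2
    clf = GaussianNB().fit(X, y)
    print(set(clf.predict(X)))                   # all three classes get predicted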

So, my question is: since I performed vectorization before fitting the Naive Bayes classifier during text classification, is there any pre-processing I need to do on the above data before fitting the GaussianNB classifier with the training data, so that it can predict all three classes (0, 1, and 2) instead of only 0 and 2?

No, there is no preprocessing needed for the multiclass part. For the Gaussian part, however: as the name suggests, this model tries to fit a Gaussian pdf to each feature. Consequently, if your features do not follow a Gaussian distribution, it can fail. If you can figure out a transformation of each feature (based on the data you have) that makes it more Gaussian-like, it will help the model. For example, some of your features appear to be huge numbers, which can cause serious difficulties if they do not follow a Gaussian distribution. You might want to normalise your data, or even drop these features.
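One possible sketch of such a transformation, assuming the X_train/y_train split from your question (log1p and standardisation are just two common choices here, not the only ones):

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer, StandardScaler

    model = make_pipeline(
        FunctionTransformer(np.log1p),  # compresses huge non-negative features like F
        StandardScaler(),               # puts all features on a comparable scale
        GaussianNB(),
    )
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))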

The only reason your model never predicts 1 is that, under the Naive Bayes assumptions and with the data provided, that class is not probable enough to ever be considered. You can try normalising the features as described above. If this fails, you can also artificially "overweight" selected classes by passing your own class priors to sklearn via GaussianNB's priors parameter (normally the prior of each class is estimated from the data as "how often a sample of class X is encountered"; if you set it to a higher number, that class will be considered more probable).
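A sketch of that last option; the numbers below are made up for illustration, they must sum to 1 and follow the sorted class order (0, 1, 2):

    from sklearn.naive_bayes import GaussianNB

    # give class 1 twice the prior mass of each other class
    clf = GaussianNB(priors=[0.25, 0.50, 0.25])
    clf.fit(X_train, y_train)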