scikits learn and nltk: Naive Bayes classifier performance highly different

Question

I am comparing two Naive Bayes classifiers: one from NLTK and one from scikit-learn. I'm dealing with a multi-class classification problem (3 classes: positive (1), negative (-1), and neutral (0)).

Without performing any feature selection (that is, using all available features), I train two classifiers on a training set of 70,000 instances (noisily labeled, with a class distribution of 17% positive, 4% negative and 78% neutral). The first is an nltk.NaiveBayesClassifier and the second is a sklearn.naive_bayes.MultinomialNB (with fit_prior=True).

After training, I evaluated the classifiers on my test set of 30,000 instances and got the following results:

**NLTK's NaiveBayes**
accuracy: 0.568740
class: 1
     precision: 0.331229
     recall: 0.331565
     F-Measure: 0.331355
class: -1
     precision: 0.079253 
     recall: 0.446331 
     F-Measure: 0.134596 
class: 0
     precision: 0.849842 
     recall: 0.628126 
     F-Measure: 0.722347 


**Scikit's MultinomialNB (with fit_prior=True)**
accuracy: 0.834670
class: 1
     precision: 0.400247
     recall: 0.125359
     F-Measure: 0.190917
class: -1
     precision: 0.330836
     recall: 0.012441
     F-Measure: 0.023939
class: 0
     precision: 0.852997
     recall: 0.973406
     F-Measure: 0.909191

**Scikit's MultinomialNB (with fit_prior=False)**
accuracy: 0.834680
class: 1
     precision: 0.400380
     recall: 0.125361
     F-Measure: 0.190934
class: -1
     precision: 0.330836
     recall: 0.012441
     F-Measure: 0.023939
class: 0
     precision: 0.852998
     recall: 0.973418
     F-Measure: 0.909197

I have noticed that while scikit-learn's classifier has better overall accuracy and precision, its recall for the positive and negative classes is very low compared to NLTK's, at least with my data. Given that they should be (almost) the same classifier, isn't this strange?
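
To put the accuracy figures in context, here is a minimal sketch of how per-class precision, recall and F-measure can be computed, applied to a trivial baseline that always predicts neutral. The helper `per_class_report` and the 99-instance label list are hypothetical. The list only mirrors the 17% / 4% / 78% training distribution quoted above, and assumes the test set is distributed similarly, which the question does not state.

```python
from collections import Counter


def per_class_report(y_true, y_pred, labels):
    """Return overall accuracy and per-class precision, recall and F-measure."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        predicted = sum(n for (_, p), n in pairs.items() if p == label)
        actual = sum(n for (t, _), n in pairs.items() if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    accuracy = sum(pairs[(label, label)] for label in labels) / len(y_true)
    return accuracy, report


# Hypothetical labels: 17 positive, 4 negative, 78 neutral (assumed, not real data).
y_true = [1] * 17 + [-1] * 4 + [0] * 78
# Baseline that ignores the features and always answers "neutral".
y_pred = [0] * len(y_true)

accuracy, report = per_class_report(y_true, y_pred, labels=[1, -1, 0])
print("accuracy: %.6f" % accuracy)
for label in (1, -1, 0):
    print("class:", label, report[label])
```

Under that assumption the baseline already reaches an accuracy near 0.79 with zero recall on both minority classes, so overall accuracy alone says little about how the positive and negative classes are handled.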

What are the features? Did you try a BernoulliNB as well? That should be closer to the NLTK Naive Bayes. — Fred Foo
Thanks for the reply. The features are words with value 1 if they exist in the document (boolean). The results for scikits BernoulliNB are very close to MultinomialNB: accuracy: 0.834680 class: 1 precision: 0.400380 recall: 0.125361 F-Measure: 0.190934 class: -1 precision: 0.330836 recall: 0.012441 F-Measure: 0.023939 class: 0 precision: 0.852998 recall: 0.973418 F-Measure: 0.909197 — D T
The only thing I can see in the documentation is that NLTK's NB classifier apparently doesn't do smoothing. I wouldn't expect that to cause a big difference, though... — Fred Foo
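
The boolean word features and the two scikit-learn variants discussed in these comments could be set up roughly as follows. This is only a sketch: the six toy documents and their labels are invented for illustration, and `CountVectorizer(binary=True)` is assumed as one way to get presence/absence features, which may differ from how the asker built them.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Invented toy corpus (not the asker's data): 1 = positive, -1 = negative, 0 = neutral.
docs = [
    "great phone love it",
    "terrible battery hate it",
    "phone arrived today",
    "love the great screen",
    "hate the terrible screen",
    "battery arrived today",
]
labels = [1, -1, 0, 1, -1, 0]

# binary=True records only whether a word occurs (1) or not (0).
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(docs)

multinomial = MultinomialNB().fit(X, labels)
bernoulli = BernoulliNB().fit(X, labels)

new_docs = ["love the phone", "terrible today"]
X_new = vectorizer.transform(new_docs)
pred_multinomial = multinomial.predict(X_new)
pred_bernoulli = bernoulli.predict(X_new)
print("MultinomialNB:", pred_multinomial)
print("BernoulliNB:  ", pred_bernoulli)
```

On binary features the two models still differ in one respect: BernoulliNB also scores the absence of each vocabulary word, while MultinomialNB only uses the words that are present.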

Marc Shivers · Accepted Answer · 2012-05-06T03:25:02

Is the default behavior for class weights the same in both libraries? The difference in precision for the rare class (-1) looks like that might be the cause...
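
One way to check the scikit-learn side of this is to inspect the fitted class priors directly. The sketch below uses random placeholder features and a hypothetical 99-instance label vector that mirrors the 17% / 4% / 78% split from the question. It shows what `fit_prior` changes in `MultinomialNB`; it does not show what NLTK does, which would need to be checked separately.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)

# Hypothetical labels mirroring the question's class split (assumed, not real data).
y = np.array([1] * 17 + [-1] * 4 + [0] * 78)
# Random boolean placeholder features; only the priors matter here.
X = rng.randint(0, 2, size=(len(y), 20))

with_prior = MultinomialNB(fit_prior=True).fit(X, y)
uniform_prior = MultinomialNB(fit_prior=False).fit(X, y)

# classes_ is sorted, so the priors are reported in the order [-1, 0, 1].
print("classes:         ", with_prior.classes_)
print("fit_prior=True:  ", np.exp(with_prior.class_log_prior_))
print("fit_prior=False: ", np.exp(uniform_prior.class_log_prior_))
```

With `fit_prior=True` the priors follow the empirical class frequencies, so the rare -1 class starts with a prior of about 0.04. With `fit_prior=False` every class gets the same prior of 1/3.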