
I am working on a Naive Bayes classifier for filtering mails. I have achieved 95% accuracy in SPAM detection and 94% in HAM detection, but I believe it can be further improved with association rule mining. I calculate the likelihood and prior probabilities of the words in mails from the training data set, and map each test mail to either the SPAM or the HAM class as given below,

c_MAP = argmax_{c ∈ C} p(d/c) * p(c) = argmax_{c ∈ C} p(f1,f2,f3...fn/c) * p(c)

where,

p(d/c) denotes the probability of document d being in class c.

p(c) denotes the prior probability of a particular class (SPAM or HAM in my case).

p(f1,f2,f3...fn/c) denotes the likelihood of words f1,f2...fn being in class c.
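
Roughly, this is what I compute. A minimal sketch (a multinomial model with Laplace smoothing is assumed; the function names are illustrative, not my actual code):

```python
import math
from collections import Counter

def train(docs, labels, vocab):
    """Estimate priors p(c) and Laplace-smoothed word likelihoods p(f/c)."""
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    word_counts = {c: Counter() for c in classes}
    total = {c: 0 for c in classes}
    for doc, c in zip(docs, labels):
        for w in doc.split():
            if w in vocab:
                word_counts[c][w] += 1
                total[c] += 1
    likelihood = {
        c: {w: (word_counts[c][w] + 1) / (total[c] + len(vocab)) for w in vocab}
        for c in classes
    }
    return prior, likelihood

def classify(doc, prior, likelihood, vocab):
    """Pick argmax_c of log p(c) + sum_i log p(fi/c), i.e. the formula above."""
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for w in doc.split():
            if w in vocab:
                score += math.log(likelihood[c][w])
        scores[c] = score
    return max(scores, key=scores.get)
```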

But in arriving at equation 2.7, we make the bag-of-words assumption and the conditional independence assumption, which trade accuracy for simplicity.
For example, the likelihood of the word lottery appearing in a SPAM mail given the presence of the word lucky should be greater than its likelihood given the presence of the word my_name (mahesh). So the presence of other words, and their positions, do affect the probabilities.

Therefore there should be some associative model, used alongside Naive Bayes, to further improve accuracy.
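
To make the idea concrete, here is a toy sketch (the words and counts are made up) of an association-aware likelihood p(lottery/SPAM, lucky) estimated from document counts, compared with the plain p(lottery/SPAM):

```python
def pair_conditioned_likelihood(docs, target, given):
    """Estimate p(target present / class, given present) from one class's documents."""
    with_given = [d for d in docs if given in d.split()]
    if not with_given:
        return 0.0
    return sum(target in d.split() for d in with_given) / len(with_given)

# Hypothetical SPAM training documents.
spam_docs = [
    "win lottery lucky draw now",
    "lucky winner claim lottery prize",
    "cheap meds no prescription",
    "lucky deal today only",
]

# Plain presence likelihood p(lottery/SPAM): 2 of 4 documents.
p_plain = sum("lottery" in d.split() for d in spam_docs) / len(spam_docs)
# Association-aware likelihood p(lottery/SPAM, lucky): 2 of the 3 docs with "lucky".
p_assoc = pair_conditioned_likelihood(spam_docs, "lottery", "lucky")
print(p_plain, p_assoc)  # 0.5 vs 0.666..., so the association shifts the likelihood
```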

Can you improve your question? It is not clear what you're asking. Are you looking to combine association rule mining with a Naïve Bayes classifier, or are you asking whether a regular Bayesian approach (via Bayesian networks, for example) would improve your results? – C. S.
I am asking whether it is possible to combine association rules with Naive Bayes to improve the results. – lsbmsb
Have you seen research like this yet? aaai.org/Papers/KDD/1998/KDD98-012.pdf – C. S.

1 Answer


If I can rephrase your question like this:

"Will relaxing the conditional independence assumption of Naive Bayes improve my classifier's performance?"

Then the answer is a surprising and counterintuitive "No."

Generally speaking, Naive Bayes classifiers, which impose strict class-conditional independence between features, offer the same or better performance than more general Bayesian networks, which allow for richer dependencies (and whose dependence structure can even be learned from data, although generally not exactly).

The reason is that, while Naive Bayes will generally get the probabilities wrong, it will generally get the decision boundary right [1].
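
Here is a toy numerical sketch of that point (the likelihood values are made up and the priors are equal). Counting a word's evidence multiple times, which is the kind of error the independence assumption causes, badly distorts the posterior probability but leaves the argmax decision unchanged:

```python
# Toy posterior for a single word with equal priors p(SPAM) = p(HAM) = 0.5.
# The likelihood values below are invented for illustration.
p_word_spam, p_word_ham = 0.08, 0.02

def posterior_spam(k):
    """Posterior p(SPAM / evidence) if the word is (wrongly) counted k times."""
    spam = 0.5 * p_word_spam ** k
    ham = 0.5 * p_word_ham ** k
    return spam / (spam + ham)

print(posterior_spam(1))  # 0.8   -- a reasonable probability estimate
print(posterior_spam(3))  # ~0.98 -- badly overconfident probability
# But the argmax is SPAM in both cases: with equal priors, repeated counting of
# a word pushes probabilities to the extremes without crossing the boundary.
```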

So: you are probably better off just making the bag-of-words assumption.

[1] http://web.cs.ucdavis.edu/~vemuri/classes/ecs271/Bayesian.pdf