I've encountered the following problem: I'm trying to classify a large number of text documents.
There are 20 classes: 1 normal and 19 abnormal. When I use Naïve Bayes classification, it works well for the 19 abnormal classes, but for the "normal" class I get many misclassification errors: almost all documents in the "normal" category are assigned to one of the other (non-normal) categories.
Here are my questions:
- How should I select the training set for the "normal" class? (Currently I just feed the classifier a set of texts labelled "normal", which makes up 1/20 of the training data.)
- Can the classifier be configured this way: if the probability of
belonging to any class is less than a certain threshold, then the
classifier assigns a default category (e.g. normal) to that sample?
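
To make the second question concrete, here is a minimal sketch of the rule I have in mind. It assumes I can get a per-class probability for each document (for example, from a `predict_proba`-style method). The function name `classify_with_fallback`, the threshold value 0.5, and the class names "spam" and "fraud" are hypothetical and only for illustration.

```python
from typing import Mapping


def classify_with_fallback(
    probs: Mapping[str, float],
    threshold: float = 0.5,    # assumed value; would need tuning on held-out data
    fallback: str = "normal",  # default category for low-confidence samples
) -> str:
    """Return the most probable class, or `fallback` if even the
    most probable class has a probability below `threshold`."""
    best_label = max(probs, key=probs.get)
    if probs[best_label] < threshold:
        return fallback
    return best_label


# Confident abnormal prediction: the top class is kept.
print(classify_with_fallback({"normal": 0.1, "spam": 0.8, "fraud": 0.1}))
# No class reaches the threshold: the sample falls back to "normal".
print(classify_with_fallback({"normal": 0.2, "spam": 0.45, "fraud": 0.35}))
```

One caveat I'm aware of: Naïve Bayes probability estimates tend to be pushed towards 0 or 1, so a fixed threshold like this may rarely trigger unless it is tuned on held-out data.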