The misclassified counts shown at the leaves of the tree sum to 2097 (895 + 700 + 428 + 74), but the confusion matrix shows 2121 misclassified instances (1999 + 122). Can someone explain the discrepancy? Why are the two numbers different?
1 Answer
Weka's output for a classifier contains two evaluation sections after the model description:
- Error on training data
- Stratified cross-validation
The first simply evaluates the trained classifier on the training data itself, whereas the second performs cross-validation in which each fold keeps roughly the same class proportions as the full dataset. Stratified cross-validation is therefore supposed to give a better picture of the classifier's performance than simple (unstratified) cross-validation, and a far more realistic one than the error on the training data.
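To make "stratified" concrete, here is a minimal sketch in plain Python. It is not Weka's implementation (Weka also shuffles the data with a random seed first); the function name `stratified_folds` and the toy labels are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def stratified_folds(labels, k):
    """Assign instance indices to k folds so each fold keeps the class mix.

    Hypothetical helper for illustration: instances of each class are
    dealt out to the folds round-robin, one class at a time.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Toy data: 20 "yes" and 10 "no" instances (a 2:1 class ratio).
labels = ["yes"] * 20 + ["no"] * 10
folds = stratified_folds(labels, k=5)

for number, fold in enumerate(folds, start=1):
    print(number, Counter(labels[i] for i in fold))
```

With this toy data every fold ends up with 4 "yes" and 2 "no" instances, so each fold preserves the 2:1 ratio of the whole dataset.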
I think the confusion matrix you posted comes from the stratified cross-validation section, whereas the misclassified counts shown in the tree must come from evaluation on the training data. That is why the two totals differ.
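The effect is easy to reproduce outside Weka. The sketch below is only an illustration under stated assumptions: it uses a 1-nearest-neighbour classifier as a stand-in for the decision tree, a made-up one-dimensional dataset with two noisy labels, and plain (unstratified) folds. It counts misclassifications once on the training data and once under 5-fold cross-validation.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def count_errors(train, test):
    """Count test instances whose predicted label differs from the true one."""
    return sum(1 for x, label in test if predict_1nn(train, x) != label)

# Made-up data: class "a" below 10, class "b" from 10 up, with two noisy labels.
data = [(x, "a" if x < 10 else "b") for x in range(20)]
data[3] = (3, "b")
data[14] = (14, "a")

# Evaluation on the training data: the model is tested on what it has already seen.
train_errors = count_errors(data, data)

# 5-fold cross-validation: every instance is predicted by a model that never saw it.
k = 5
cv_errors = 0
for fold in range(k):
    test = [point for i, point in enumerate(data) if i % k == fold]
    train = [point for i, point in enumerate(data) if i % k != fold]
    cv_errors += count_errors(train, test)

print("errors on training data:", train_errors)
print("errors under cross-validation:", cv_errors)
```

The two counts disagree for the same reason as in the question: one comes from a model tested on its own training data, the other from models tested on held-out folds.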
The decision tree output is described very nicely at https://weka.wikispaces.com/Primer#classifiers. In that example too, the misclassified counts shown in the tree differ from those in the confusion matrix under the stratified cross-validation section.
I hope this is correct.