I would like to use Weka to solve my classification problem. I have a set of training instances. Let's say the data looks like this:

@relation Relation1

@attribute att1 {val11, val12}
@attribute att2 {val21, val22}
@attribute class {class1, class2, class3}

@data
val11, val21, class1
val11, val22, class2
val12, val21, class3

In my code I read the training set from the file. I train the J48 tree and try to classify an instance. However, I have no idea how to interpret the results of the classification.

My code is as follows:

try {
    DataSource source = new DataSource("trainingset.arff");
    Instances data = source.getDataSet();
    if (data.classIndex() == -1) {
        data.setClassIndex(data.numAttributes() - 1);
    }

    Instance xyz = new Instance(data.numAttributes());
    xyz.setDataset(data);
    xyz.setValue(data.attribute(0), "val11");
    xyz.setValue(data.attribute(1), "val21");

    String[] options = new String[1];
    options[0] = "-U"; // unpruned tree
    J48 tree = new J48(); // new instance of tree
    tree.setOptions(options); // set the options
    tree.buildClassifier(data); // build classifier

    double[] distributionForInstance = tree.distributionForInstance(xyz);
    System.out.println(distributionForInstance[0]);
    System.out.println(distributionForInstance[1]);
    System.out.println(distributionForInstance[2]);

} catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

As an output I get:

0.3333333333333333
0.3333333333333333
0.3333333333333333

I also tried another way of classifying the instance:

double classifyInstance = tree.classifyInstance(xyz);
System.out.println(classifyInstance);

In this case the output is:

0.0

Could you explain how I should interpret the outputs of the distributionForInstance and classifyInstance methods? My aim is to build a classifier that tells me which class a given instance belongs to.

2
You should review the similar post that mentions distributionForInstance and classifyInstance. - johncasey

2 Answers

0
votes

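Have a look at the Javadoc. The distributionForInstance method returns an array of class membership probabilities: the first element is the probability that the instance belongs to the first class, the second element the probability for the second class, and so on. The classifyInstance method returns the predicted class as a double that holds the index into the list of class labels, so your output of 0.0 refers to the first label declared for the class attribute, class1.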
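
To make the interpretation concrete, here is a minimal sketch in plain Java with no Weka dependency. The class name ClassInterpretation and the helpers mostLikelyIndex and labelFor are hypothetical names made up for illustration, and the label list is simply copied by hand from the ARFF header in the question. Resolving ties to the lowest index is an assumption of this sketch; it happens to agree with the 0.0 reported in the question, but check the Weka source if the exact tie-breaking matters to you.

```java
import java.util.Arrays;
import java.util.List;

public class ClassInterpretation {

    // Class labels in the order they are declared in the ARFF header (copied by hand).
    static final List<String> CLASS_LABELS = Arrays.asList("class1", "class2", "class3");

    // Index of the largest probability; ties resolve to the lowest index (assumption).
    static int mostLikelyIndex(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    // Turns a class value such as 0.0 into its label by using it as a list index.
    static String labelFor(double classValue) {
        return CLASS_LABELS.get((int) classValue);
    }

    public static void main(String[] args) {
        // Values taken from the question's output.
        double[] distribution = {1.0 / 3, 1.0 / 3, 1.0 / 3};
        double classValue = 0.0;

        System.out.println(labelFor(mostLikelyIndex(distribution)));
        System.out.println(labelFor(classValue));
    }
}
```

With the numbers from the question, both lines print class1: all three probabilities are equal, so the first index is chosen, and the class value 0.0 points at the first label.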

0
votes

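Use the value method of Attribute to get the class label:

double classifyInstance = tree.classifyInstance(xyz);
// classifyInstance returns the class index as a double; Attribute.value expects an int
String classStr = data.classAttribute().value((int) classifyInstance);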