I would like to use Weka to solve my classification problem. I have a set of training instances. Let's say the data looks like this:
@relation Relation1
@attribute att1 {val11, val12}
@attribute att2 {val21, val22}
@attribute class {class1, class2, class3}
@data
val11, val21, class1
val11, val22, class2
val12, val21, class3
In my code I read the training set from a file, train a J48 tree, and try to classify an instance. However, I have no idea how to interpret the results of the classification.
My code is as follows:
try {
    DataSource source = new DataSource("trainingset.arff");
    Instances data = source.getDataSet();
    if (data.classIndex() == -1) {
        data.setClassIndex(data.numAttributes() - 1);
    }
    Instance xyz = new Instance(data.numAttributes());
    xyz.setDataset(data);
    xyz.setValue(data.attribute(0), "val11");
    xyz.setValue(data.attribute(1), "val21");
    String[] options = new String[1];
    options[0] = "-U"; // unpruned tree
    J48 tree = new J48(); // new instance of tree
    tree.setOptions(options); // set the options
    tree.buildClassifier(data); // build classifier
    double[] distributionForInstance = tree.distributionForInstance(xyz);
    System.out.println(distributionForInstance[0]);
    System.out.println(distributionForInstance[1]);
    System.out.println(distributionForInstance[2]);
} catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
As output I get:
0.3333333333333333
0.3333333333333333
0.3333333333333333
I also tried another way of classifying the instance:
double classifyInstance = tree.classifyInstance(xyz);
System.out.println(classifyInstance);
In this case the output is:
0.0
Could you explain how I should interpret the outputs of the distributionForInstance and classifyInstance methods? My aim is to create a classifier that tells me which class a given instance belongs to.
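A minimal sketch of one possible reading of these outputs, under two assumptions: that distributionForInstance returns one probability per class, in the order the labels are declared in the @attribute class line, and that classifyInstance returns the index of the predicted label as a double. The sketch uses plain Java arrays instead of Weka so that it is self-contained; the class name, the CLASS_LABELS array, and the helpers argMax and labelFor are hypothetical names introduced only for illustration. In Weka itself, the label lookup would presumably be data.classAttribute().value((int) classifyInstance).

```java
class ClassificationOutputSketch {
    // Class labels in the order they are declared in the ARFF header (assumed).
    static final String[] CLASS_LABELS = {"class1", "class2", "class3"};

    // Index of the largest probability; in this sketch ties go to the lowest index.
    static int argMax(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    // Maps a class index that was returned as a double to its label.
    static String labelFor(double classIndex) {
        return CLASS_LABELS[(int) classIndex];
    }

    public static void main(String[] args) {
        // The distribution printed above: every class is equally likely.
        double[] distribution = {1.0 / 3, 1.0 / 3, 1.0 / 3};
        for (int i = 0; i < distribution.length; i++) {
            System.out.println(CLASS_LABELS[i] + " -> " + distribution[i]);
        }
        // The 0.0 printed above, read as an index into the label list.
        System.out.println(labelFor(0.0));
        // The same label recovered from the distribution itself.
        System.out.println(labelFor(argMax(distribution)));
    }
}
```

Read this way, the three equal values would mean the tree cannot separate the classes on this tiny training set, and 0.0 would simply be the index of the first label, class1, chosen because no class has a higher probability.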