
I'm learning clustering in Weka and trying to get the "Incorrectly clustered instances" figure after running 'ClusterEvaluation' with python-weka-wrapper 0.3.8 (Python 2.7).

Currently, I have the class attribute in the input ARFF file, can build the model successfully, and can get the number of clusters, the cluster_assignments, and the clustered instances per cluster for the training set, etc. (all following the example in python-weka-wrapper-examples/src/wekaexamples/clusterers/clusterers.py).

However, I want to know whether each instance was clustered correctly with respect to its labeled class in the input ARFF file, and also the ratio of incorrectly clustered instances (just like the output of the Weka GUI).

I'm not sure where to set the equivalent of the Weka GUI v3.8.0 option Explorer/Cluster/Cluster mode/Classes to clusters evaluation before I run the weka.clusterers.ClusterEvaluation module.

Does anyone know how to call mapClasses (http://weka.sourceforge.net/doc.stable/weka/clusterers/ClusterEvaluation.html#mapClasses-int-int-int:A:A-int:A-double:A-double:A-int-) via python-weka-wrapper?
Or is there any other way to get the ratio of clustered instances versus the labeled class attribute?

Thanks!


1 Answer


If the Instances object you use for testing has a class attribute set when you perform cluster evaluation with the ClusterEvaluation class, you can use the classes_to_clusters property to obtain the mapping that was collected while evaluating the clusterer. The equivalent Java method is ClusterEvaluation.getClassesToClusters. The mapClasses method hasn't been exposed as a Python function.
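
Once you have that mapping, the incorrectly clustered count and ratio can be computed by hand. The following is only a rough sketch in plain Python, not part of python-weka-wrapper: the function name incorrectly_clustered and the sample data are made up for illustration. It assumes that the mapping is ordered by cluster number, holds the class index assigned to each cluster, and uses a negative value for clusters that were assigned no class. It also assumes you have already collected the per-instance cluster assignments and class indices as plain sequences.

```python
def incorrectly_clustered(cluster_assignments, class_indices, classes_to_clusters):
    """Count instances whose class differs from the class mapped to their cluster.

    cluster_assignments: cluster number of each instance
    class_indices:       labeled class index of each instance (same order)
    classes_to_clusters: class index mapped to each cluster, ordered by
                         cluster number; a negative value is ASSUMED to
                         mean "no class assigned to this cluster"
    Returns (number of incorrect instances, ratio of incorrect instances).
    """
    incorrect = 0
    for cluster, actual in zip(cluster_assignments, class_indices):
        mapped = classes_to_clusters[int(cluster)]
        # An instance in a cluster without a class counts as incorrect.
        if mapped < 0 or mapped != int(actual):
            incorrect += 1
    total = len(cluster_assignments)
    ratio = float(incorrect) / total if total else 0.0
    return incorrect, ratio


# Hypothetical data: 5 instances, 3 clusters, 2 classes.
assignments = [0, 0, 1, 1, 2]   # cluster of each instance
labels = [0, 1, 1, 1, 0]        # class index of each instance
mapping = [0, 1, -1]            # cluster 0 -> class 0, cluster 1 -> class 1, cluster 2 -> none

count, ratio = incorrectly_clustered(assignments, labels, mapping)
print("Incorrectly clustered instances: %d (%.1f%%)" % (count, 100.0 * ratio))
```

In practice the first and third arguments would come from the evaluation object (cluster_assignments and classes_to_clusters), and the second from the class values of the test instances; check the result against the Weka GUI output on the same data before relying on it.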